Report #42580
[gotcha] Arbitrary code execution during pickle.load() despite implementing __getstate__/__setstate__
Never unpickle data from untrusted sources. Implementing `__getstate__` and `__setstate__` does not provide security. For safer alternatives, use a data-only format such as `json` or `msgpack`, with strict schema validation (for example via `marshmallow`). If pickle is required, subclass `pickle.Unpickler` and override `find_class()` so that only an explicit allowlist of safe classes can be loaded.
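The `find_class()` approach can be sketched as follows. This is a minimal illustration, not a vetted hardening recipe: `ALLOWED_GLOBALS`, `RestrictedUnpickler`, and `restricted_loads` are hypothetical names, and the allowlist contents are an assumption that each application must choose for itself.

```python
import io
import pickle

# Hypothetical allowlist of (module, name) pairs the application accepts.
# Every entry must itself be safe to call with attacker-chosen arguments.
ALLOWED_GLOBALS = {
    ("collections", "OrderedDict"),
}


class RestrictedUnpickler(pickle.Unpickler):
    """Unpickler that refuses any global not listed in ALLOWED_GLOBALS."""

    def find_class(self, module, name):
        if (module, name) in ALLOWED_GLOBALS:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(
            f"global '{module}.{name}' is not on the allowlist"
        )


def restricted_loads(data: bytes):
    """Drop-in analogue of pickle.loads() that applies the allowlist."""
    return RestrictedUnpickler(io.BytesIO(data)).load()
```

Plain containers and scalars (dicts, lists, strings, numbers) load without ever reaching `find_class()`, so they pass through unchanged; anything that names a global outside the allowlist raises `pickle.UnpicklingError` before the callable is resolved. An allowlist narrows what a payload can reach, but it is only as safe as the entries on it.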
Journey Context:
Developers often assume that overriding `__getstate__` and `__setstate__` gives them full control over pickle serialization and prevents malicious payloads. However, these hooks only shape how instances of their own class are saved and restored. When pickling, the protocol consults `__reduce_ex__` (which falls back to `__reduce__`) to decide how an object will be reconstructed, and the resulting callable and arguments are written into the byte stream. An attacker who controls that stream can pickle their own class whose `__reduce__` returns `(os.system, ('malicious_command',))`. On `pickle.load()`, the unpickler imports and calls that callable during reconstruction; the victim's `__setstate__` is never involved, because the payload does not need to reference the victim's class at all. This is fundamental to pickle's design (objects are rebuilt by calling importable callables named in the stream) and cannot be mitigated by `__getstate__` or `__setstate__` alone.
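The mechanism can be demonstrated with a harmless stand-in for `os.system`. In this sketch, `Victim`, `Payload`, and `setstate_calls` are hypothetical names introduced for illustration, and `operator.mul` plays the role of the attacker-chosen callable.

```python
import operator
import pickle

setstate_calls = []  # records every Victim.__setstate__ invocation


class Victim:
    """Class whose author expects __setstate__ to guard loading."""

    def __init__(self, value=1):
        self.value = value

    def __getstate__(self):
        return {"value": self.value}

    def __setstate__(self, state):
        setstate_calls.append(state)
        if set(state) != {"value"}:
            raise ValueError("unexpected state")
        self.__dict__.update(state)


class Payload:
    """Attacker-side class; it only has to exist where the bytes are made."""

    def __reduce__(self):
        # Harmless stand-in: a real attacker would return something like
        # (os.system, ("malicious_command",)) here.
        return (operator.mul, (6, 7))


malicious_bytes = pickle.dumps(Payload())

# The application expects a Victim, but the byte stream decides what runs.
result = pickle.loads(malicious_bytes)
```

After loading, `result` is the return value of the attacker-chosen call (42 here) rather than a `Victim` instance, and `setstate_calls` is still empty: the validation in `__setstate__` never had a chance to run.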
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T01:56:30.817972+00:00— report_created — created