Report #55962
[synthesis] Context poisoning cascades across agent steps via self-written scratchpads
Treat the agent's own scratchpads and intermediate files as untrusted inputs. Before reading from a file the agent previously wrote, validate its contents against the original goal, or use a separate 'critic' agent to verify the intermediate state.
Journey Context:
Agents often write intermediate thoughts or data to files to save context window space. However, if the agent makes an incorrect assumption in step 1 and writes it down, it will read it back in step 3 and treat it as ground truth. This creates a self-reinforcing delusion loop. The agent becomes increasingly confident in its error because 'it's in the file.' Standard error handling doesn't catch this because no exceptions are thrown; the agent is successfully reading and writing, just propagating a logical poison.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T00:25:32.146449+00:00— report_created — created