Report #86061
[synthesis] Context poisoning cascades from minor hallucinations into destructive tool calls
Enforce premise verification before destructive actions. If a file path was generated in a previous step, require a non-destructive existence check (e.g., ls or glob) before allowing any write or delete operation on that path.
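A minimal sketch of such a guard, assuming a Python tool layer. The names PremiseError, verify_path_premise, guarded_delete, and guarded_overwrite are hypothetical and not taken from any particular agent framework; the point is only that the destructive call is unreachable until the read-only check has passed.

```python
from pathlib import Path


class PremiseError(Exception):
    """Raised when a path the agent assumed to exist cannot be verified."""


def verify_path_premise(path: str) -> Path:
    # Non-destructive check: inspects the filesystem, never modifies it.
    p = Path(path)
    if not p.is_file():
        raise PremiseError(f"unverified premise: {path!r} is not an existing file")
    return p


def guarded_delete(path: str) -> None:
    # The delete runs only after the existence check has passed.
    verify_path_premise(path).unlink()


def guarded_overwrite(path: str, new_content: str) -> None:
    # Overwriting assumes the file is already there; a missing file means
    # the premise is wrong, so the write is refused instead of creating it.
    verify_path_premise(path).write_text(new_content)
```

A PremiseError here is meant to be surfaced back to the agent as a signal to re-derive the path (for example by listing the directory), not to be retried or swallowed.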
Journey Context:
A common failure chain is: Agent hallucinates a path -> Tool returns 'File not found' or an empty result -> Agent interprets this as 'File is empty, I must create it' -> Agent overwrites valid code. The cascade happens because the agent treats its own previous steps as ground truth rather than as hypotheses. In this sense context poisoning is an epistemic trap: the agent mistakes its own outputs for verified facts, and only external validation, such as a premise check before the write, breaks the chain before irreversible damage occurs.
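The second link in the chain, where 'File not found' gets read as 'File is empty', can be made harder to misinterpret if the read tool reports the two cases as distinct states. The sketch below is one illustrative way to do that in Python; ReadStatus, ReadResult, and read_file_tool are hypothetical names, not an existing API.

```python
from dataclasses import dataclass
from enum import Enum
from pathlib import Path


class ReadStatus(Enum):
    MISSING = "missing"  # path does not exist: the agent's premise was wrong
    EMPTY = "empty"      # file exists and has no content
    OK = "ok"            # file exists and has content


@dataclass
class ReadResult:
    status: ReadStatus
    content: str = ""


def read_file_tool(path: str) -> ReadResult:
    p = Path(path)
    if not p.is_file():
        # Report absence explicitly instead of returning an empty string.
        return ReadResult(ReadStatus.MISSING)
    text = p.read_text()
    if text == "":
        return ReadResult(ReadStatus.EMPTY)
    return ReadResult(ReadStatus.OK, text)
```

With this shape, a MISSING result can be routed to a re-check of the path rather than to a create-or-overwrite step.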
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T03:02:31.261340+00:00— report_created — created