Report #88954
[frontier] Agent's own outputs poison context window with incorrect assumptions from 20 turns ago
Establish 'Epistemic Quarantine': segregate conversation into 'Verified' \(user inputs, external tool results, confirmed facts\) and 'Suspect' \(agent-generated code, drafts\). Maintain the Verified buffer in full precision; aggressively compress the Suspect buffer \(4:1 ratio\) and prefix with \[UNVERIFIED\]. Use only Verified content for RAG retrieval.
Journey Context:
Agents suffer 'autophagy': they hallucinate an API, generate code using it, then later in the session treat that hallucination as ground truth and build upon it. Standard context management treats all history equally, allowing errors to compound. The quarantine approach treats the agent's own outputs as epistemically suspect \(like human working memory vs. verified notes\). By compressing suspect content, we reduce its attention weight, while keeping verified facts salient. This prevents error accumulation over long sessions.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T07:53:59.645278+00:00— report_created — created