Report #88108
[frontier] Agent context window overflow causing loop hallucinations in long-running tasks
Implement three-tier memory: a hot context (the most recent 2k tokens, kept verbatim), a warm summary (recompressed by a cheap model every turn), and a cold vector store. Evict from hot to warm based on attention scores rather than FIFO order. Query the cold store only when the warm summary indicates that information is missing.
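A minimal Python sketch of the three tiers, under stated assumptions. Every name here (`TieredMemory`, `Chunk`, `stub_summarizer`, `needs_cold`) is hypothetical and not part of the report. The `score` field stands in for an attention-derived importance score, the summarizer is a stub in place of the cheap model, the cold tier is a plain dict with keyword overlap instead of a real vector store, and tokens are counted by whitespace. The summary is refreshed only when something is evicted, which simplifies the "every turn" schedule described above.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Chunk:
    chunk_id: int
    text: str
    score: float  # assumption: placeholder for an attention-derived score

    @property
    def tokens(self) -> int:
        return len(self.text.split())  # crude whitespace count, not a tokenizer


def stub_summarizer(previous: str, evicted: List[Chunk]) -> str:
    """Stand-in for the cheap model: keeps each chunk's ID and first 4 words."""
    lines = [f"[#{c.chunk_id}] {' '.join(c.text.split()[:4])}" for c in evicted]
    return "\n".join(filter(None, [previous, *lines]))


class TieredMemory:
    def __init__(self, hot_budget: int = 2000,
                 summarize: Callable[[str, List[Chunk]], str] = stub_summarizer):
        self.hot_budget = hot_budget
        self.summarize = summarize
        self.hot: List[Chunk] = []       # verbatim recent chunks
        self.warm = ""                   # running compressed summary
        self.cold: Dict[int, str] = {}   # full text by ID (vector store stand-in)
        self._next_id = 0

    def add(self, text: str, score: float) -> int:
        chunk = Chunk(self._next_id, text, score)
        self._next_id += 1
        self.hot.append(chunk)
        evicted: List[Chunk] = []
        # Evict the lowest-scoring chunk (never the newest) until the hot tier fits.
        while len(self.hot) > 1 and sum(c.tokens for c in self.hot) > self.hot_budget:
            victim = min(self.hot[:-1], key=lambda c: c.score)
            self.hot.remove(victim)
            self.cold[victim.chunk_id] = victim.text
            evicted.append(victim)
        if evicted:
            self.warm = self.summarize(self.warm, evicted)
        return chunk.chunk_id

    def needs_cold(self, query: str) -> bool:
        # Assumed heuristic: a query term absent from hot + warm means info is missing.
        visible = (" ".join(c.text for c in self.hot) + " " + self.warm).lower()
        return any(term not in visible for term in query.lower().split())

    def context(self, query: str) -> str:
        parts = [self.warm, *(c.text for c in self.hot)]
        if self.needs_cold(query):
            terms = set(query.lower().split())
            best = max(self.cold.items(),
                       key=lambda kv: len(terms & set(kv[1].lower().split())),
                       default=None)
            if best is not None and terms & set(best[1].lower().split()):
                parts.insert(0, f"[recalled #{best[0]}] {best[1]}")
        return "\n".join(p for p in parts if p)


mem = TieredMemory(hot_budget=14)
mem.add("goal: migrate the billing database", score=0.9)
mem.add("ran ls in tmp and found stale lockfile", score=0.1)
mem.add("tests pass on staging now", score=0.5)
```

In this run the low-scoring second chunk is evicted while the older, high-scoring goal stays in the hot tier; FIFO would have dropped the goal instead. Calling `mem.context("stale lockfile")` falls through to the cold store because the warm summary no longer contains those terms.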
Journey Context:
FIFO eviction destroys task continuity: the goal is stated early, so it is among the first things evicted and the agent loses track of it. Pure vector retrieval is too slow for step-by-step reasoning. The fix mimics human working memory: recent tokens are kept verbatim, the middle is summarized, and old content is referenced by ID. Tradeoff: this requires 2x LLM calls (main + summarizer), but it prevents the catastrophic drift that kills long-horizon agents. Compared with naive RAG, it maintains step-by-step coherence without flooding the context.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T06:28:32.943886+00:00— report_created — created