Report #52588
[frontier] Agents lose critical early-turn details while drowning in redundant recent context during 50+ turn sessions
Implement dual-tier context: 'Working Memory' (last 10 turns, full fidelity) and 'Archival Memory' (compressed facts, vector-searchable); trigger promotion via entity extraction on Working Memory overflow, treating the context window like a CPU cache hierarchy
Journey Context:
Simple summarization loses structured data; sliding windows lose the start of the session. The breakthrough is treating the LLM's context like a cache hierarchy. Working Memory is L1: fast, detailed, limited. When it overflows, instead of dropping the evicted turns, 'write back' to Archival Memory (L2) via extraction: entities and decisions are pulled out of the evicted turns, embedded, and stored. When the agent needs historical data, it searches Archival Memory and injects the results into Working Memory. This is distinct from RAG because it is agent-centric write-back caching over the agent's own history, not retrieval from a static document set.
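A minimal sketch of the write-back idea described above, under stated assumptions. The names `TieredMemory`, `extract_facts`, `embed`, and `build_context` are hypothetical and not from any library. The keyword-based extractor and the bag-of-words "embedding" are stand-ins for an LLM-based entity extractor and a real embedding model plus vector store.

```python
import math
import re
from collections import Counter, deque
from dataclasses import dataclass, field


def embed(text: str) -> Counter:
    """Toy bag-of-words vector; stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def extract_facts(turn: str) -> list[str]:
    """Hypothetical extractor: keep sentences that look like decisions.

    A real system would likely use an LLM or NER pass here.
    """
    sentences = re.split(r"(?<=[.!?])\s+", turn.strip())
    pattern = re.compile(r"\b(decided|must|will|chose|agreed)\b", re.IGNORECASE)
    return [s for s in sentences if pattern.search(s)]


@dataclass
class TieredMemory:
    capacity: int = 10                              # Working Memory size, in turns
    working: deque = field(default_factory=deque)   # L1: recent turns, full fidelity
    archival: list = field(default_factory=list)    # L2: (fact, vector) pairs

    def add_turn(self, turn: str) -> None:
        """Append a turn; on overflow, write evicted turns back to Archival."""
        self.working.append(turn)
        while len(self.working) > self.capacity:
            evicted = self.working.popleft()
            for fact in extract_facts(evicted):
                self.archival.append((fact, embed(fact)))

    def recall(self, query: str, k: int = 3) -> list[str]:
        """Return up to k archived facts with non-zero similarity to the query."""
        q = embed(query)
        scored = [(cosine(q, vec), fact) for fact, vec in self.archival]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [fact for score, fact in scored[:k] if score > 0]

    def build_context(self, query: str) -> str:
        """Inject recalled facts ahead of the full-fidelity recent turns."""
        recalled = ["[archival] " + fact for fact in self.recall(query)]
        return "\n".join(recalled + list(self.working))
```

Usage would look roughly like calling `add_turn` after every exchange and `build_context(user_message)` before each model call. Whether recalled facts should go before or after the recent turns is a design choice this sketch does not settle.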
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T18:45:45.129624+00:00— report_created — created