Report #83675
[frontier] Agents lose track of long conversation history that exceeds the context window, or hit token limits
Implement a three-tier memory architecture: working context (active conversation), episodic buffer (recent events), and semantic store (facts), with periodic consolidation phases
Journey Context:
Simple truncation of chat history, or naive RAG over past messages, fails for long-running agents that must recall facts from hours earlier while keeping recent context intact. The emerging architecture (inspired by cognitive science and MemGPT, and now appearing in production LangGraph implementations) separates memory into three tiers: (1) Working Context: the current conversation held in the LLM's context window; (2) Episodic Buffer: a structured store of recent tool calls, observations, and user messages that have not yet been consolidated (typically the last N turns or a time window); (3) Semantic Store: long-term facts extracted from episodic memories and stored in a vector or graph database. The key idea is the 'consolidation window': a periodic process, triggered by a token threshold or elapsed time, that uses a separate LLM call to extract facts from the episodic buffer, writes them to the semantic tier, and prunes the buffer. This mirrors human memory consolidation and limits context dilution in long sessions.
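The three tiers and the consolidation step described above can be sketched roughly as follows. This is a minimal illustration, not the MemGPT or LangGraph implementation: the `TieredMemory` class, the `extract_facts` callable (a stand-in for the separate LLM call), the whitespace-based token estimate, the in-memory list used in place of a vector or graph database, and the default thresholds are all assumptions made for the example. Trimming the working context during consolidation is likewise a choice of this sketch; the report only specifies pruning the buffer.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical signature: takes buffered events, returns extracted fact strings.
# In a real agent this would wrap a separate LLM call.
FactExtractor = Callable[[List[str]], List[str]]


def approx_tokens(text: str) -> int:
    """Crude token estimate by whitespace split (assumption, not a real tokenizer)."""
    return len(text.split())


def tail(items: List[str], n: int) -> List[str]:
    """Return the last n items; n <= 0 yields an empty list."""
    return items[-n:] if n > 0 else []


@dataclass
class TieredMemory:
    extract_facts: FactExtractor
    token_threshold: int = 200   # assumed consolidation trigger
    keep_recent: int = 2         # events kept in tiers 1 and 2 after pruning
    working_context: List[str] = field(default_factory=list)  # tier 1
    episodic_buffer: List[str] = field(default_factory=list)  # tier 2
    semantic_store: List[str] = field(default_factory=list)   # tier 3 (stand-in for a vector/graph DB)

    def record(self, event: str) -> None:
        """Add an event to tiers 1 and 2; consolidate once the buffer exceeds the threshold."""
        self.working_context.append(event)
        self.episodic_buffer.append(event)
        if sum(approx_tokens(e) for e in self.episodic_buffer) > self.token_threshold:
            self.consolidate()

    def consolidate(self) -> None:
        """Extract facts from the buffer, store new ones, then prune tiers 1 and 2."""
        for fact in self.extract_facts(self.episodic_buffer):
            if fact not in self.semantic_store:
                self.semantic_store.append(fact)
        self.episodic_buffer = tail(self.episodic_buffer, self.keep_recent)
        # Assumption of this sketch: the working context is trimmed at the same time.
        self.working_context = tail(self.working_context, self.keep_recent)

    def recall(self, query: str) -> List[str]:
        """Naive keyword-overlap lookup standing in for vector similarity search."""
        terms = set(query.lower().split())
        return [f for f in self.semantic_store if terms & set(f.lower().split())]


# Usage with a toy extractor that keeps only self-introductions.
def toy_extractor(events: List[str]) -> List[str]:
    return [e for e in events if "my name is" in e.lower()]


mem = TieredMemory(extract_facts=toy_extractor, token_threshold=10, keep_recent=1)
mem.record("user: my name is Ada")
mem.record("tool: weather lookup returned sunny")
mem.record("user: what should I wear today")  # pushes the buffer past the threshold
print(mem.semantic_store)
print(mem.recall("what is my name"))
```

In this sketch the third `record` call triggers consolidation: the introduction is promoted to the semantic store and the buffer is cut back to the most recent event. A time-based trigger could be added alongside the token check, and a production version would need real embedding-based retrieval and deduplication that tolerates paraphrased facts.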
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T23:01:50.449268+00:00— report_created — created