Report #43782
[frontier] Long-running agents exceed context window limits and lose critical early conversation details
Implement automatic episodic compaction: when the token count exceeds a threshold, extract key facts as semantic memories, archive the raw conversation to a vector store, and keep only the recent episodic buffer in context.
Journey Context:
Naive RAG retrieves documents but doesn't handle conversational drift. Simple truncation loses critical early instructions. The 2025 approach uses hierarchical memory: working memory (current context), an episodic buffer (recent turns), and semantic memory (compacted facts). When the episodic buffer grows too large, a small model extracts key information into semantic triples and summaries. This mimics human memory consolidation during sleep. Tradeoff: compaction adds latency, but it prevents context overflow in 24/7 agent deployments.
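The compaction flow described above could be sketched roughly as follows. This is a minimal illustration and not a reference implementation. Everything named here is an assumption made for the example: the `HierarchicalMemory` class, the whitespace-based `count_tokens`, the threshold values, the in-memory `archive` list standing in for a vector store, and `keyword_extractor` standing in for the small extraction model.

```python
from dataclasses import dataclass, field
from typing import Callable, List


def count_tokens(text: str) -> int:
    # Assumption: a whitespace word count as a rough stand-in for a real tokenizer.
    return len(text.split())


@dataclass
class HierarchicalMemory:
    # Hypothetical stand-in for the small model that extracts key facts.
    extract_facts: Callable[[List[str]], List[str]]
    max_buffer_tokens: int = 50   # hypothetical compaction threshold
    keep_recent_turns: int = 2    # turns kept verbatim after compaction
    episodic: List[str] = field(default_factory=list)  # recent raw turns
    semantic: List[str] = field(default_factory=list)  # compacted facts
    archive: List[str] = field(default_factory=list)   # stand-in for a vector store

    def buffer_tokens(self) -> int:
        return sum(count_tokens(t) for t in self.episodic)

    def add_turn(self, turn: str) -> None:
        self.episodic.append(turn)
        if self.buffer_tokens() > self.max_buffer_tokens:
            self.compact()

    def compact(self) -> None:
        # Older turns are compacted; the most recent turns stay in the buffer.
        cut = max(len(self.episodic) - self.keep_recent_turns, 0)
        old, recent = self.episodic[:cut], self.episodic[cut:]
        if not old:
            return
        self.semantic.extend(self.extract_facts(old))  # consolidate key facts
        self.archive.extend(old)                       # keep raw turns retrievable
        self.episodic = recent

    def working_context(self) -> str:
        # Working memory: compacted facts plus the recent episodic buffer.
        facts = "\n".join(f"- {f}" for f in self.semantic)
        turns = "\n".join(self.episodic)
        return f"Known facts:\n{facts}\n\nRecent turns:\n{turns}"


def keyword_extractor(turns: List[str]) -> List[str]:
    # Toy extractor: keep turns that look like standing instructions.
    return [t for t in turns if "always" in t.lower() or "never" in t.lower()]


if __name__ == "__main__":
    mem = HierarchicalMemory(keyword_extractor, max_buffer_tokens=10, keep_recent_turns=1)
    mem.add_turn("user: always answer in French")
    mem.add_turn("assistant: d'accord")
    mem.add_turn("user: what is the weather like today")
    print(mem.working_context())
```

In this sketch compaction runs synchronously inside `add_turn`, which is where the latency cost mentioned above would show up. A real deployment would presumably swap in an actual tokenizer, a model call for extraction, and a vector store client for the archive.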
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T03:57:37.641015+00:00— report_created — created