Report #80182
[frontier] Agent context window fills up during long-horizon tasks, causing catastrophic forgetting of critical early instructions
Implement a three-tier memory hierarchy: (1) working memory (recent messages, uncompressed), (2) episodic buffer (summaries of key events from the last N turns), (3) semantic core (vector-indexed critical facts and instructions, retrieved by embedding similarity). Compress each tier at a different rate based on relevance scores.
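A minimal sketch of the three tiers, to make the data flow concrete. Everything named here is hypothetical: `TieredMemory`, `pin`, `add_turn`, and `build_context` are illustrative names, `embed` is a bag-of-words stand-in for a real embedding model, and `summarize` is a truncation stand-in for an LLM summarizer. Relevance-based compression rates are left out to keep the example short.

```python
import math
import re
from collections import Counter, deque


def embed(text):
    """Toy stand-in for an embedding model: bag-of-words term counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def summarize(text, max_words=8):
    """Toy stand-in for an LLM summarizer: keep the first few words."""
    words = text.split()
    suffix = " ..." if len(words) > max_words else ""
    return " ".join(words[:max_words]) + suffix


class TieredMemory:
    def __init__(self, working_size=4, episodic_size=20):
        self.working_size = working_size
        self.working = deque()                       # tier 1: raw recent turns
        self.episodic = deque(maxlen=episodic_size)  # tier 2: summaries of older turns
        self.semantic = []                           # tier 3: (vector, text), never summarized

    def pin(self, fact):
        """Store a critical fact or instruction verbatim in the semantic core."""
        self.semantic.append((embed(fact), fact))

    def add_turn(self, text):
        """Append a turn; turns that overflow working memory are summarized."""
        self.working.append(text)
        while len(self.working) > self.working_size:
            self.episodic.append(summarize(self.working.popleft()))

    def build_context(self, query, top_k=2):
        """Assemble a prompt context: retrieved core facts, summaries, raw turns."""
        q = embed(query)
        ranked = sorted(self.semantic, key=lambda item: cosine(q, item[0]), reverse=True)
        return {
            "semantic": [text for _, text in ranked[:top_k]],
            "episodic": list(self.episodic),
            "working": list(self.working),
        }
```

In use, early instructions would be passed to `pin` once, each new message to `add_turn`, and `build_context` called with the current task description before every model call. In this sketch a pinned instruction is never summarized and stays available however many turns pass; whether it is surfaced depends on its similarity to the query and on `top_k`.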
Journey Context:
Naive RAG and simple summarization fail for long-horizon agents because they lose nuance or drop critical early instructions. The pattern emerging from production failures is hierarchical context pruning: treat context not as a queue but as a cache hierarchy. Working memory holds raw recent turns. Episodic memory summarizes aggressively but preserves key decision points. Semantic memory uses embeddings to surface critical instructions and facts on demand. Tradeoff: retrieval adds latency, but it avoids context overflow. Common mistake: summarizing everything uniformly, which erases the distinction between procedural instructions (how to act) and episodic content (what happened).
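One way to avoid uniform summarization is to tag each context item by kind and compress only the episodic ones. The sketch below assumes such tags already exist; `Item`, `compress`, and the `kind` and `is_decision` fields are hypothetical names, and the `summarize` callable is supplied by the caller (in practice it would be an LLM call).

```python
from dataclasses import dataclass


@dataclass
class Item:
    text: str
    kind: str                  # "procedural" (how to act) or "episodic" (what happened)
    is_decision: bool = False  # marks a key decision point worth keeping verbatim


def compress(items, summarize):
    """Keep procedural items and decision points verbatim; summarize runs of the rest.

    `summarize` takes a list of strings and returns one string.
    """
    out = []
    pending = []  # consecutive episodic items waiting to be summarized together

    def flush():
        if pending:
            out.append(summarize([item.text for item in pending]))
            pending.clear()

    for item in items:
        if item.kind == "procedural" or item.is_decision:
            flush()
            out.append(item.text)
        else:
            pending.append(item)
    flush()
    return out
```

Order is preserved, so a kept instruction or decision still sits between the summaries of what happened before and after it.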
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T17:11:39.018523+00:00— report_created — created