Report #69995
[frontier] Context window overflow in long-running agent sessions causing catastrophic forgetting of critical instructions
Implement MemGPT-style virtual context management: keep a working context window of 8k tokens, let the agent edit core memory and summarize or evict messages through tool calls, and retrieve from archival storage with embedding search that the agent triggers itself (self-directed preemption) rather than on a fixed schedule.
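To make the archival-retrieval part concrete, here is a minimal sketch of an embedding-style search over evicted text. Everything in it is an assumption for illustration: `ArchivalMemory`, `embed`, and `cosine` are hypothetical names, and the bag-of-words "embedding" is a toy stand-in for a real embedding model and vector DB.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase bag-of-words counts."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ArchivalMemory:
    """Hypothetical archival tier: stores evicted text next to its vector."""

    def __init__(self) -> None:
        self._entries: list[tuple[str, Counter]] = []

    def insert(self, text: str) -> None:
        self._entries.append((text, embed(text)))

    def search(self, query: str, top_k: int = 3) -> list[str]:
        """Return the top_k stored texts most similar to the query."""
        q = embed(query)
        ranked = sorted(self._entries, key=lambda e: cosine(q, e[1]), reverse=True)
        return [text for text, _ in ranked[:top_k]]


archive = ArchivalMemory()
archive.insert("user prefers dark mode in the editor")
archive.insert("deployment failed on staging cluster")
best_match = archive.search("staging deployment", top_k=1)
```

In an agent loop, `search` would be exposed as the tool the model calls when it decides it needs older history back in its working context.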
Journey Context:
Standard approaches either truncated the conversation (losing early system instructions) or naively summarized the entire history when the limit approached (losing nuance). The 2025 pattern treats the LLM as an OS process with virtual memory. The agent has access to three memory tiers: working context (current conversation), core memory (editable key-value store of user facts/preferences), and archival memory (vector DB of full history). The LLM is prompted to use tools like 'edit_core_memory' and 'search_archival' proactively. When the working context approaches the token limit (e.g., 70% of max), the system triggers a 'page fault': the LLM is prompted to identify which messages can be compressed into a summary and stored in archival, or which facts need to be promoted to core memory. This happens via the LLM's own reasoning (self-directed), not a fixed heuristic.
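The 'page fault' flow described above can be sketched roughly as follows. This is one possible reading, not a reference implementation: `AgentMemory`, `handle_page_fault`, and the plan dictionary shape are hypothetical, the token counter is a crude word count, and `planner` stands in for the LLM call that decides what to promote or evict. The sketch assumes evicted messages go to archival in full while a short summary stays in the working context.

```python
from dataclasses import dataclass, field
from typing import Callable

MAX_TOKENS = 8_000         # working-context budget from the report
PRESSURE_THRESHOLD = 0.70  # the report's example trigger point


def count_tokens(text: str) -> int:
    """Crude assumption: one token per whitespace-separated word."""
    return len(text.split())


@dataclass
class AgentMemory:
    system_prompt: str                                  # never evicted
    working: list[str] = field(default_factory=list)    # current conversation
    core: dict[str, str] = field(default_factory=dict)  # editable facts
    archival: list[str] = field(default_factory=list)   # stand-in for a vector DB

    def used_tokens(self) -> int:
        return count_tokens(self.system_prompt) + sum(
            count_tokens(m) for m in self.working
        )

    def under_pressure(self) -> bool:
        return self.used_tokens() >= PRESSURE_THRESHOLD * MAX_TOKENS

    def edit_core_memory(self, key: str, value: str) -> None:
        self.core[key] = value

    def evict(self, count: int, summary: str) -> None:
        """Move the oldest `count` messages to archival, keep a summary."""
        if count <= 0:
            return
        evicted, self.working = self.working[:count], self.working[count:]
        self.archival.extend(evicted)
        self.working.insert(0, f"[summary] {summary}")


# The planner plays the role of the LLM: it inspects memory and returns
# {"promote": {key: value}, "evict_count": int, "summary": str}.
Planner = Callable[[AgentMemory], dict]


def handle_page_fault(memory: AgentMemory, planner: Planner) -> bool:
    """Ask the planner what to do, but only once the threshold is crossed."""
    if not memory.under_pressure():
        return False
    plan = planner(memory)
    for key, value in plan.get("promote", {}).items():
        memory.edit_core_memory(key, value)
    memory.evict(plan.get("evict_count", 0), plan.get("summary", ""))
    return True
```

Because the system prompt lives outside the evictable list, the early instructions survive every eviction, which is the failure this report is about. In a real agent the `planner` would be a model call whose tool invocations produce the plan.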
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T00:04:05.274782+00:00— report_created — created