Report #58222
[frontier] Context window overflow in long-running agent conversations causing catastrophic forgetting of system instructions
Implement MemGPT-style hierarchical memory with core memory (fixed context window), recall memory (vector search), and archival memory (compressed summaries), orchestrated by an LLM-based memory controller.
Journey Context:
Naive truncation drops either recent critical information or the early system prompt. MemGPT treats the LLM context as virtual memory managed by an OS-like controller. 'Core memory' holds the working set (persona, human preferences, recent conversation). When core memory fills, the controller (an LLM with access to special functions) decides what to 'page out' to 'recall memory' (a vector DB of conversation snippets) or compress into 'archival memory' (summaries). For retrieval, the controller queries recall or archival memory and explicitly edits core memory via function calls (core_memory_replace, core_memory_append). Tradeoff: added latency from retrieval plus controller inference, in exchange for an effectively unbounded context (limited by external storage rather than by the context window). Essential for personal assistant agents that run for weeks.
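The paging behaviour described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not MemGPT's actual implementation: the class `HierarchicalMemory`, its fields, and the methods `add_message`, `context_size`, and `recall_search` are hypothetical names; only `core_memory_append` and `core_memory_replace` are taken from the report. A character budget stands in for a token budget, keyword overlap stands in for vector search, and a fixed oldest-first eviction rule stands in for the LLM controller's decision.

```python
from dataclasses import dataclass, field


@dataclass
class HierarchicalMemory:
    """Toy three-tier memory. All names except core_memory_* are hypothetical."""
    core_limit: int = 200   # character budget, standing in for a token budget
    system: str = ""        # pinned system instructions, never paged out
    core: dict = field(default_factory=lambda: {"persona": "", "human": ""})
    messages: list = field(default_factory=list)   # recent conversation in context
    recall: list = field(default_factory=list)     # paged-out snippets (vector DB stand-in)
    archival: list = field(default_factory=list)   # compressed summaries

    def core_memory_append(self, section: str, text: str) -> None:
        """Append a fact to a core memory section."""
        self.core[section] = (self.core[section] + " " + text).strip()

    def core_memory_replace(self, section: str, old: str, new: str) -> None:
        """Replace an outdated fact in a core memory section."""
        if old not in self.core[section]:
            raise ValueError(f"{old!r} not found in core section {section!r}")
        self.core[section] = self.core[section].replace(old, new)

    def context_size(self) -> int:
        """Total size of everything that would be sent to the model."""
        return (len(self.system)
                + sum(len(v) for v in self.core.values())
                + sum(len(m) for m in self.messages))

    def add_message(self, text: str) -> None:
        """Record a new message, then page out old ones if over budget."""
        self.messages.append(text)
        self._page_out()

    def _page_out(self) -> None:
        # Assumption: evict oldest messages first. In the design above, an LLM
        # controller would choose what to evict and would write the summary.
        evicted = []
        while self.context_size() > self.core_limit and len(self.messages) > 1:
            evicted.append(self.messages.pop(0))
        if evicted:
            self.recall.extend(evicted)
            # Placeholder "summary": truncated snippets joined together.
            self.archival.append("summary: " + " | ".join(m[:20] for m in evicted))

    def recall_search(self, query: str, k: int = 2) -> list:
        """Return up to k paged-out snippets sharing words with the query."""
        words = set(query.lower().split())
        scored = [(len(words & set(s.lower().split())), s) for s in self.recall]
        scored = [pair for pair in scored if pair[0] > 0]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [s for _, s in scored[:k]]
```

The point of the sketch is that the system instructions and the core sections are never candidates for eviction, so the overflow pressure falls only on conversation history, which stays retrievable through `recall_search`. Swapping in a real embedding index and an LLM-driven eviction policy would add the retrieval and inference latency noted in the tradeoff.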
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T04:12:59.797343+00:00 - report_created - created