Report #20776
[frontier] Agent forgets initial system instructions after 20 turns of conversation due to context window overflow
Implement hierarchical memory with explicit management functions (core_memory_replace, archival_memory_search) rather than passive truncation, treating the context window as limited 'working memory' with explicit paging to vector storage.
Journey Context:
Standard agents pass the full conversation history to the LLM until they hit the token limit, then truncate from the start or middle of the history. Truncation is blind to importance, so critical information is eventually dropped: the user's initial constraint 'Budget is $500' is cut while the agent still carries irrelevant pleasantries from turn 5. MemGPT (UC Berkeley) treats the LLM context the way an operating system treats virtual memory: the context holds 'core memory' (essential persona and instructions) and 'working memory' (recent conversation), and the agent is given explicit function calls to manage both. The agent actively calls core_memory_replace to update key facts, archival_memory_insert to save content to a vector DB, and recall_memory_search to retrieve relevant past context. This shifts the design from passive truncation to active memory management, so critical constraints can persist across 100+ turn sessions because they are either stored explicitly in core memory or recalled via search.
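The tiering described above can be illustrated with a minimal sketch. It is a toy under stated assumptions, not the MemGPT library's API: the method names core_memory_replace, archival_memory_insert, and archival_memory_search are borrowed from this report, while `HierarchicalMemory`, `add_turn`, `build_context`, and `max_recent_turns` are hypothetical names introduced for illustration. A word-overlap score stands in for vector similarity so the example runs without an embedding model or vector DB.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class HierarchicalMemory:
    """Toy three-tier memory: pinned core facts, a bounded queue of
    recent turns, and an archive that receives evicted turns."""
    max_recent_turns: int = 4  # hypothetical stand-in for a token budget
    core: dict = field(default_factory=dict)
    recent: deque = field(default_factory=deque)
    archive: list = field(default_factory=list)

    def core_memory_replace(self, key, value):
        # Core memory is always included in the prompt, so it is never truncated.
        self.core[key] = value

    def archival_memory_insert(self, text):
        # A real system would embed the text and write it to a vector store.
        self.archive.append(text)

    def archival_memory_search(self, query, top_k=3):
        # Assumption: word overlap replaces embedding similarity in this sketch.
        query_words = set(query.lower().split())
        scored = [(len(query_words & set(t.lower().split())), t) for t in self.archive]
        ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: s[0], reverse=True)
        return [text for _, text in ranked[:top_k]]

    def add_turn(self, role, text):
        # Evict the oldest turns to the archive instead of discarding them.
        self.recent.append(f"{role}: {text}")
        while len(self.recent) > self.max_recent_turns:
            self.archival_memory_insert(self.recent.popleft())

    def build_context(self):
        # The prompt is core memory plus only the most recent turns.
        core_lines = [f"{k}: {v}" for k, v in self.core.items()]
        return "\n".join(["[core memory]", *core_lines, "[recent turns]", *self.recent])


memory = HierarchicalMemory(max_recent_turns=4)
memory.core_memory_replace("budget", "Budget is $500")
memory.add_turn("user", "My budget is $500 for the laptop")
for i in range(1, 21):
    memory.add_turn("user", f"small talk message {i}")

context = memory.build_context()
hits = memory.archival_memory_search("laptop budget")
```

After the 20 filler turns, the original budget message has been paged out of the recent-turn queue, yet the constraint is still present in `context` through core memory and the original wording can be recovered from the archive via `hits`. In a real agent these three methods would be exposed to the model as callable tools, and deciding when to call them is left to the model rather than to a fixed eviction rule.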
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T13:16:35.125760+00:00— report_created — created