Report #24380
[frontier] Long conversations exceed the context window, losing early instructions
Implement tiered memory: working context (current window), archival memory (vector store), and core memory (permanent persona); use an explicit 'page fault' mechanism to fetch from archival memory when a query needs information that is no longer in the window
Journey Context:
When agents run for dozens of turns, standard 'keep last N messages' truncation drops early context and, in naive implementations, even the system prompt. Simple summarization loses details. The MemGPT/Letta pattern treats the LLM context window like OS virtual memory. It maintains a `working_context` (the recent messages that fit in the window), archival memory (a vector DB; MemGPT also keeps a separate `recall_storage` for the raw message history), and a `core_memory` (persona and standing instructions, pinned in the window so truncation never evicts them). When the agent needs information that is not in working context, it explicitly calls `archival_memory_search`, the analogue of a page fault; `core_memory_append` is the complementary write path that promotes a fact into core memory. This lets a conversation run well beyond the window size without losing critical instructions, though it requires the agent to be trained or tool-equipped to manage its own memory.
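The tiering described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the Letta API: the class `TieredMemory`, the methods `add_message` and `build_prompt`, and the `max_working` budget are hypothetical names, and a crude word-overlap score stands in for a real vector-store similarity search. Only `archival_memory_search` and `core_memory_append` borrow their names from the report.

```python
from collections import deque
from dataclasses import dataclass, field


def _tokens(text: str) -> set:
    # Crude lowercase word set; an assumption standing in for an embedding.
    return set(text.lower().split())


@dataclass
class TieredMemory:
    core: list = field(default_factory=list)       # pinned facts, never evicted
    working: deque = field(default_factory=deque)  # recent turns inside the window
    archival: list = field(default_factory=list)   # stand-in for a vector store
    max_working: int = 4                           # hypothetical budget, in messages

    def core_memory_append(self, fact: str) -> None:
        # Promote a fact into core memory so it is always in the prompt.
        self.core.append(fact)

    def add_message(self, message: str) -> None:
        # New turns enter working context; overflow is paged out to archival.
        self.working.append(message)
        while len(self.working) > self.max_working:
            self.archival.append(self.working.popleft())

    def archival_memory_search(self, query: str, k: int = 2) -> list:
        # "Page fault": fetch evicted messages that share words with the query.
        q = _tokens(query)
        scored = [(len(q & _tokens(doc)), doc) for doc in self.archival]
        hits = [pair for pair in scored if pair[0] > 0]
        hits.sort(key=lambda pair: pair[0], reverse=True)
        return [doc for _, doc in hits[:k]]

    def build_prompt(self, retrieved=()) -> str:
        # Core memory always leads, then paged-in results, then recent turns.
        parts = ["[core]", *self.core, "[retrieved]", *retrieved,
                 "[working]", *self.working]
        return "\n".join(parts)


# Usage: with a two-message window, the early turn is evicted but recoverable.
memory = TieredMemory(max_working=2)
memory.core_memory_append("Always answer in French.")
for turn in ["user: my dog is named Biscuit", "assistant: noted",
             "user: what is the weather", "assistant: sunny"]:
    memory.add_message(turn)
paged_in = memory.archival_memory_search("what is my dog named")
print(memory.build_prompt(paged_in))
```

In a real agent the model itself would decide when to call the search tool; here the caller triggers it by hand to keep the sketch self-contained.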
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T19:19:40.441466+00:00— report_created — created