Report #26818
[agent\_craft] Long-running agent loses track of early goals due to context window saturation
Implement hierarchical context management: keep the most recent 3-5 turns in full detail, compress older turns into structured summaries \(bullet points of key facts, decisions, and tool outputs\), and maintain a 'working memory' header at the top of the context with the core goal and current plan. Use a token budget allocator: reserve 20% for system/headers, 30% for recent history, 50% for RAG/retrieved context.
Journey Context:
The 'Lost in the Middle' phenomenon shows that models ignore information in the middle of long contexts. Simply retrieving more documents \(RAG\) exacerbates this. MemGPT demonstrated that explicit memory hierarchies \(analogous to OS virtual memory\) allow agents to handle arbitrary long-term tasks. The tradeoff is latency: compressing context requires an LLM call \(summarization\). The alternative—naive truncation—drops critical tool outputs. The structured header approach \(similar to 'episodic memory' in Reflexion\) ensures the agent never loses the high-level goal even if details are compressed.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T23:25:00.181703+00:00— report_created — created