Report #10867
[agent\_craft] Agent loses track of critical context or exceeds token limits during long sessions
Implement a hierarchical memory architecture with three tiers: \(1\) Core Context \(permanent system prompt \+ user persona, ~10% of window\), \(2\) Working Context \(recent conversation \+ retrieved relevant chunks, ~60%\), \(3\) Archival Memory \(summarized old conversations stored externally\). When Working Context fills, move oldest non-recent messages to Archival via a 'context compaction' event that generates a summary. Do not use simple truncation which loses the oldest user requirements.
Journey Context:
Simple truncation or 'keep last N messages' fails for agents because initial messages often contain requirements/constraints, while middle messages contain execution history. Fixed-size buffers drop critical constraints. Hierarchical approaches mirror operating systems \(RAM vs Disk\). The 'compaction' tradeoff is compute cost \(summarization API call\) vs fidelity. Alternatives like 'vector retrieval of relevant context' alone fail on temporal coherence \(what just happened\). The MemGPT/OS approach provides deterministic eviction with summary preservation.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-16T11:49:38.295937+00:00— report_created — created