Report #83704
[frontier] Agent loses original constraints after 30+ turns due to naive context truncation
Implement tiered memory compaction: separate 'core memory' (non-negotiable constraints) from 'episodic memory' (conversation history). Use the OS-inspired paging from MemGPT to move stale turns to compressed summaries, but never compress the core constraint tier.
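A minimal sketch of the two-tier layout described above, not MemGPT's actual implementation. The `TieredMemory` class, `count_tokens`, `naive_summarize`, the whitespace token estimate, and the `budget`/`keep_recent` parameters are all hypothetical names and choices made for illustration; a real agent would plug in its model's tokenizer and an LLM-backed summarizer.

```python
from dataclasses import dataclass, field
from typing import Callable, List


def count_tokens(text: str) -> int:
    # Crude whitespace estimate (assumption); swap in the model's tokenizer.
    return len(text.split())


def naive_summarize(turns: List[str]) -> str:
    # Placeholder summarizer (assumption): first 8 words of each stale turn.
    # In practice this would be an LLM call.
    return " | ".join(" ".join(t.split()[:8]) for t in turns)


@dataclass
class TieredMemory:
    core: List[str]          # constraint tier: never summarized or evicted
    budget: int              # token budget for the verbatim episodic tier only
    keep_recent: int = 4     # newest turns that always stay verbatim
    summarize: Callable[[List[str]], str] = naive_summarize
    summaries: List[str] = field(default_factory=list)  # paged-out turns
    episodic: List[str] = field(default_factory=list)   # verbatim turns

    def add_turn(self, turn: str) -> None:
        self.episodic.append(turn)
        self._compact()

    def _compact(self) -> None:
        # Only the episodic tier is inspected; self.core is never touched here.
        over_budget = sum(count_tokens(t) for t in self.episodic) > self.budget
        cut = len(self.episodic) - self.keep_recent
        if over_budget and cut > 0:
            stale, self.episodic = self.episodic[:cut], self.episodic[cut:]
            self.summaries.append(self.summarize(stale))

    def render(self) -> str:
        # Core constraints are always emitted first and in full.
        parts = ["[CORE CONSTRAINTS]", *self.core]
        if self.summaries:
            parts += ["[SUMMARY OF EARLIER TURNS]", *self.summaries]
        parts += ["[RECENT TURNS]", *self.episodic]
        return "\n".join(parts)


mem = TieredMemory(core=["Never delete files outside /tmp"], budget=20, keep_recent=2)
for i in range(40):
    mem.add_turn(f"turn {i}: user asks about item {i} and agent replies")
prompt = mem.render()
```

After 40 turns the constraint is still present verbatim at the top of `prompt`, while only the two newest turns remain uncompressed. The `summaries` list grows without bound in this sketch; a longer-lived agent would probably need to merge older summaries as well.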
Journey Context:
Teams often try to solve this with larger context windows, but 'lost in the middle' effects persist regardless of window size. The insight from MemGPT is that LLMs need virtual memory management: context should be treated the way an OS treats RAM. The key mistake is treating all tokens equally. Constraints are 'code,' not 'data', so they belong in protected memory that is paged out last, if at all, and never compressed. The compaction algorithm must preserve semantic density: when summarizing episodic turns, use 'instruction-aware summarization' that explicitly extracts and preserves constraint-related content, so that constraints stated mid-conversation survive compaction. Alternatives like full RAG retrieval introduce latency and retrieval errors; tiered compaction is presented here as the 2025 standard for agents that need sub-100ms responses and long-session stability.
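One way to read 'instruction-aware summarization' is sketched below: constraint-like sentences in stale turns are lifted out verbatim, and only the remaining text is compressed. This is an illustrative assumption, not a method the report specifies. The cue-word list `CONSTRAINT_CUES`, the helper names, and the word-truncation digest are hypothetical stand-ins; a production system would more likely use an LLM or a classifier to detect constraints.

```python
import re
from typing import List, Tuple

# Hypothetical cue words that mark a sentence as constraint-like.
CONSTRAINT_CUES = re.compile(
    r"\b(must|never|always|do not|don't|only|required|forbidden)\b",
    re.IGNORECASE,
)


def split_sentences(text: str) -> List[str]:
    # Naive splitter: break after ., ! or ? followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def instruction_aware_summary(
    turns: List[str], max_words: int = 6
) -> Tuple[List[str], str]:
    """Return (constraints kept verbatim, lossy digest of everything else)."""
    constraints: List[str] = []
    digest: List[str] = []
    for turn in turns:
        for sentence in split_sentences(turn):
            if CONSTRAINT_CUES.search(sentence):
                # Constraint-like content is preserved word for word.
                if sentence not in constraints:
                    constraints.append(sentence)
            else:
                # Everything else is truncated (stand-in for real summarization).
                digest.append(" ".join(sentence.split()[:max_words]))
    return constraints, " / ".join(digest)


stale_turns = [
    "Please refactor the parser. You must never change the public API.",
    "Sure, I refactored the tokenizer module and ran the full test suite afterwards.",
]
kept, digest = instruction_aware_summary(stale_turns)
```

Here `kept` holds the API constraint unchanged, so it can be promoted into the core tier before the stale turns are discarded, while `digest` carries only a shortened trace of the rest. Keyword matching will miss implicitly phrased constraints and flag some ordinary sentences, which is why it is only a placeholder.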
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T23:04:52.844592+00:00— report_created — created