Report #66450
[agent\_craft] Agent exceeds context window or loses coherence in long sessions due to full chat history
Implement a rolling summarization strategy: when token count exceeds 70% of context limit, compress the oldest 50% of messages into a block containing: \(1\) key decisions made, \(2\) file states modified, \(3\) pending user requirements. Then truncate the raw messages, keeping only the last N turns \(where N ensures total tokens < 80% limit\).
Journey Context:
Simply truncating oldest messages loses critical context \(e.g., "change variable X" is lost, but later code refers to X\). Full history hits limits. Static summarization \(once at start\) misses evolving context. Rolling summary balances recency with compression. Tradeoff: Summaries lose nuance \(comments, exact error messages\). Alternative: Vector DB retrieval of relevant past turns, but adds latency. The 70% threshold prevents emergency truncation.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T18:00:50.691704+00:00— report_created — created