Report #37700
[frontier] Agent context window overflow causing truncated history or task failure on long runs
Implement rolling context distillation: periodically invoke the LLM to compress its working context into a structured summary that replaces the raw history. Maintain a small verbatim buffer of recent interactions plus a distilled summary of everything prior. The distillation prompt must preserve task state, pending actions, key decisions, and unresolved questions, rather than merely summarizing events.
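A minimal sketch of the buffer mechanics described above, under stated assumptions: the class name RollingContext, its fields, and the (previous_summary, turns) -> new_summary distiller signature are hypothetical and not taken from any library. The budget is counted in turns for simplicity; a real agent would more likely count tokens. The distiller is injected so the logic can run without a model, and fake_distill is only a stand-in for the LLM call.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical signature: (previous_summary, turns_to_fold_in) -> new_summary.
# In a real agent this would wrap an LLM call.
Distiller = Callable[[str, List[str]], str]


@dataclass
class RollingContext:
    distill: Distiller
    max_recent: int = 6   # verbatim turns allowed before compression triggers
    keep_recent: int = 2  # verbatim turns kept after compression
    summary: str = ""
    recent: List[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        if not 0 <= self.keep_recent <= self.max_recent:
            raise ValueError("keep_recent must be between 0 and max_recent")

    def add(self, turn: str) -> None:
        """Append a turn; fold older turns into the summary when over budget."""
        self.recent.append(turn)
        if len(self.recent) > self.max_recent:
            cut = len(self.recent) - self.keep_recent
            old, self.recent = self.recent[:cut], self.recent[cut:]
            # The new summary replaces the old one and the raw turns.
            self.summary = self.distill(self.summary, old)

    def render(self) -> str:
        """Context to send to the model: distilled state, then recent turns."""
        parts = []
        if self.summary:
            parts.append("DISTILLED STATE:\n" + self.summary)
        parts.extend(self.recent)
        return "\n".join(parts)


def fake_distill(previous: str, turns: List[str]) -> str:
    """Stand-in for the LLM call: counts how many turns were folded in."""
    already = int(previous.split()[0]) if previous else 0
    return f"{already + len(turns)} turns distilled"
```

Usage: construct RollingContext(distill=fake_distill, max_recent=3, keep_recent=1), call add() once per interaction, and pass render() to the model. Passing the previous summary into the distiller, rather than re-reading raw history, is what keeps the cost of each compression bounded.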
Journey Context:
Production agents on long tasks eventually exceed their context window. Naive truncation drops early context that is often critical (the task definition, key decisions). Storing everything in a vector DB and retrieving on demand can break conversational coherence and adds latency. Rolling distillation is the emerging pattern: treat the context window as a fixed-size buffer with a compression step. The key insight is that the distillation prompt matters enormously: generic summarization loses task-critical details, while task-state-aware distillation preserves them. Teams using generic 'summarize the conversation' prompts see agents lose track of pending work; teams using 'extract current task state, completed steps, pending actions, and key decisions' prompts report much better coherence on long runs. The tradeoff: distillation adds an LLM call per compression and can itself drop details. It is still usually preferable to truncation (which discards the oldest turns wholesale) or unstructured retrieval (which loses narrative flow). LangGraph's memory documentation describes a related approach, combining checkpointed state with summarization of older messages.
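To make the contrast between a generic and a task-state-aware prompt concrete, here is a hedged sketch of a prompt builder plus a cheap check on the model's reply. The section names are drawn from the items this report lists; the function names, the exact wording of the prompt, and the "NAME:" reply format are all assumptions for illustration, not a tested or recommended prompt.

```python
from typing import List

# Illustrative section names, taken from the items the report says a
# distillation should preserve.
STATE_SECTIONS = [
    "TASK STATE",
    "COMPLETED STEPS",
    "PENDING ACTIONS",
    "KEY DECISIONS",
    "UNRESOLVED QUESTIONS",
]


def build_distillation_prompt(previous_state: str, turns: List[str]) -> str:
    """Build a task-state-aware prompt (hypothetical wording)."""
    headings = "\n".join(f"{name}:" for name in STATE_SECTIONS)
    transcript = "\n".join(turns)
    return (
        "You are compressing an agent's working context.\n"
        "Do not narrate events. Update the structured state so the agent "
        "could resume the task from it alone.\n"
        "Carry forward anything from the previous state that is still true.\n\n"
        f"Previous state:\n{previous_state or '(none)'}\n\n"
        f"New interactions:\n{transcript}\n\n"
        f"Reply using exactly these headings:\n{headings}\n"
    )


def missing_sections(reply: str) -> List[str]:
    """Return the required headings absent from a distillation reply."""
    return [name for name in STATE_SECTIONS if f"{name}:" not in reply]
```

One possible design choice: if missing_sections() returns anything, keep the raw history and retry the distillation instead of replacing history with an incomplete summary.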
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T17:45:39.643690+00:00— report_created — created