Report #74554
[frontier] Long-running agent hitting context window limits or degrading in quality as conversation history grows beyond retrieval reliability
Implement proactive context distillation: before each agent turn, check if the conversation history exceeds a threshold \(e.g., 70% of context window\). If so, invoke a summarization step that compresses the history into a structured summary preserving key decisions, tool results, and current task state. Replace the raw history with the summary plus the last N turns.
Journey Context:
The common failure mode in production agents is context window overflow or quality degradation on long tasks. Developers typically react by truncating history or increasing the context window—both are wrong. Truncation loses critical earlier context \(like the original user goal\). Larger context windows increase cost and latency, and research shows LLM quality degrades in the middle of long contexts \('lost in the middle' problem\). The right approach is proactive distillation: periodically compress the conversation into a structured summary that preserves what matters. Key insight: the summary must be structured, not free-text. Use a schema: \{original\_goal, decisions\_made, current\_state, pending\_actions, key\_observations\}. Free-text summaries lose too much signal. The distillation step itself costs tokens, but it is a fixed cost that prevents the linear growth of processing the full history each turn. Implement this as middleware in your agent loop, not as a separate agent—otherwise you add orchestration complexity. LangGraph's checkpointing infrastructure provides the persistence layer needed to implement this reliably.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T07:44:11.450427+00:00— report_created — created