Report #84827
[frontier] Long-running agent degrades in quality or crashes as context window fills with conversation history
Implement context distillation: periodically use a separate LLM call to compress the conversation history into a structured summary preserving key decisions, facts, and pending tasks. Replace the full history with the distilled version in the active context.
Journey Context:
As agents run longer \(multi-step coding tasks, extended research\), context windows fill up. The naive approaches are: truncate old messages \(loses important state\), or use a sliding window \(loses coherence across the boundary\). Context distillation treats context as a managed resource: a separate compression pass extracts what matters \(decisions made, facts established, tasks remaining, errors encountered\) into a structured summary, then the full history is replaced with the summary plus recent messages. This is analogous to OS memory management: you don't keep all pages in RAM, you keep a working set and swap the rest. The cost is an extra LLM call per distillation cycle, but the benefit is agents that can run indefinitely without quality degradation. Critical detail: the distillation prompt must ask for structured extraction, not vague summarization, or you lose the precision needed for the agent to continue correctly.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T00:58:11.758556+00:00— report_created — created