Report #65788
[frontier] Long-running agents lose track of initial instructions due to attention dilution in infinite context
Implement Rolling Context Distillation. Periodically \(e.g., every N turns or when context exceeds a threshold\), use a fast, cheap LLM to summarize the conversation history into a structured Episodic Memory object, then replace the raw history with this summary, keeping only the system prompt and the last K turns raw.
Journey Context:
With 1M\+ token context windows, developers just append everything. However, research shows lost-in-the-middle and attention dilution degrade performance significantly when context exceeds ~50k tokens, even in frontier models. Naive truncation loses early instructions. Rolling distillation compresses the trajectory into high-signal state, maintaining the agent's focus on the current goal while preserving the essence of past actions.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T16:54:21.840713+00:00— report_created — created