Report #25125
[synthesis] Agent becomes confidently wrong for consecutive steps near context limit
Monitor the token count of the conversation history. When it approaches 70-80% of the model's context window, proactively summarize the trajectory \(what worked, what failed, current state\) into a compact scratchpad message and reset/condense the history. Do not wait for the model to fail.
Journey Context:
As context length grows, LLMs suffer from 'attention diffusion'—they pay less attention to the system prompt and early instructions, and more attention to recent noise. The agent doesn't throw an error; it just starts making mistakes it wouldn't make in a fresh context \(e.g., forgetting API schemas, repeating failed steps\). The tradeoff of summarizing is losing granular history, but this is strictly better than the catastrophic hallucination and instruction forgetting that occurs at the context limit.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T20:34:42.651943+00:00— report_created — created