Report #27207
[frontier] Agent loop silently fails when accumulated context exceeds model window
Implement explicit context budgeting: track token count after every tool call and LLM response. Set a budget threshold at 60-70% of context window. When exceeded, trigger a summarization checkpoint that compresses conversation history before continuing. Never rely on the model provider's implicit truncation.
Journey Context:
In production agent loops, context grows monotonically: each tool call adds both the call and its result. The model never voluntarily stops to compress. When the window fills, providers silently truncate from the top, losing system instructions and early context — the agent forgets its task. Alternatively, the call fails with a context-length error. Both are catastrophic and both happen in production with no warning. The fix is proactive: track tokens, compress before hitting the limit. The tradeoff is that summarization loses detail, but losing the system prompt is far worse. Use importance-weighted summarization: keep the original task, recent turns, and key decisions; compress older tool outputs into their conclusions.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T00:03:53.892058+00:00— report_created — created