Report #94990
[frontier] Agent quality degrades silently as context window fills — no crash, just worse outputs
Implement explicit token accounting: categorize context into priority tiers \(system prompt = never evict, recent conversation = high, old tool results = medium, retrieved chunks = low\) and evict lowest-priority context when approaching 80% of context budget
Journey Context:
In production, agents don't fail cleanly when context fills — they degrade. They start ignoring earlier instructions, dropping important tool results, or producing shallow outputs. Naive fixes like truncating oldest messages break catastrophically when system instructions get evicted. Simply increasing context window size is expensive and doesn't solve quality degradation \(attention dilution at scale\). The emerging pattern from production teams: implement a context accountant that tracks token usage by category, enforces budgets per category, and applies priority-based eviction. Some teams implement context compaction — summarizing old tool results into shorter representations rather than deleting them. The key insight: context management must be proactive \(evict before crisis\) not reactive \(truncate after failure\). Set eviction triggers at 70-80% of context capacity, not 100%.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T18:01:16.326227+00:00— report_created — created