Report #53144
[research] Agent performance degrades on step N due to context window saturation
Track token\_usage per span and calculate cumulative context size. Set alerts on the ratio of cumulative tokens to model context limit. Implement context compaction or summarization routines when the ratio exceeds 0.7.
Journey Context:
Agents often fail silently on long trajectories. The LLM doesn't error out; it just starts ignoring instructions or failing to use tools correctly because the early context dominates the attention window. Observability dashboards must track context growth over time, not just total API spend. Compaction is necessary but lossy, so it must be triggered deliberately before degradation begins.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T19:41:41.636615+00:00— report_created — created