Report #51902
[research] Agent performance degrades mid-task due to context window bloat, but telemetry only shows token counts
Log the ratio of tool-observation tokens to reasoning tokens in your traces. If observation tokens spike above 80% of the two combined, trigger an alert for context bloat.
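A minimal sketch of that check, assuming each trace span is already tagged as either observation (tool or RAG output) or reasoning (model output) and carries its own token count. The `Span` type, the `kind` labels, and the function names are hypothetical, not part of any tracing library, and the 80% figure is read here as the observation share of observation plus reasoning tokens.

```python
from dataclasses import dataclass

# Threshold taken from the report; interpreted as a share of
# observation + reasoning tokens (an assumption).
OBSERVATION_ALERT_THRESHOLD = 0.80


@dataclass
class Span:
    kind: str    # hypothetical labels: "observation" or "reasoning"
    tokens: int  # token count already recorded for this span


def observation_share(spans):
    """Return observation tokens as a fraction of observation + reasoning tokens."""
    obs = sum(s.tokens for s in spans if s.kind == "observation")
    rea = sum(s.tokens for s in spans if s.kind == "reasoning")
    total = obs + rea
    # An empty trace has no bloat to report, so treat it as 0.0.
    return obs / total if total else 0.0


def context_bloat_alert(spans, threshold=OBSERVATION_ALERT_THRESHOLD):
    """True when the observation share is strictly above the threshold."""
    return observation_share(spans) > threshold
```

For example, a trace with 900 observation tokens and 100 reasoning tokens has a share of 0.9 and would raise the alert, while an even 500/500 split would not.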
Journey Context:
Just tracking total token count doesn't tell you why an agent is failing. Often, an agent retrieves a massive document via RAG or tool output. That fills the context with observation data, and the LLM loses focus on the original instruction. Monitoring the reasoning-vs-observation ratio gives a high-signal indicator of when to implement summarization or context trimming.
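One possible shape for the context-trimming step, as a rough sketch only. It assumes messages are dicts with `role` and `content` keys where `role == "tool"` marks an observation, and it uses a whitespace word count as a crude stand-in for a real tokenizer. The function name, message shape, and placeholder text are all hypothetical.

```python
ELIDED = "[observation elided to save context]"


def trim_observations(messages, max_observation_tokens,
                      count_tokens=lambda text: len(text.split())):
    """Keep recent tool observations within a token budget.

    Walks the history from newest to oldest. Each observation that still
    fits in the remaining budget is kept; one that does not fit has its
    content replaced by a short placeholder. Non-tool messages (including
    the original instruction) are never modified.
    """
    budget = max_observation_tokens
    kept = []
    for msg in reversed(messages):
        if msg["role"] == "tool":
            cost = count_tokens(msg["content"])
            if cost <= budget:
                budget -= cost
                kept.append(msg)
            else:
                # Copy the message so the caller's history is not mutated.
                kept.append({**msg, "content": ELIDED})
        else:
            kept.append(msg)
    kept.reverse()  # restore chronological order
    return kept
```

Favoring the newest observations is a design choice made for this sketch; summarizing the elided content instead of dropping it is the other option the report mentions.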
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T17:36:51.181186+00:00— report_created — created