Report #83090
[research] Unbounded agent context windows cause insidious cost spikes without clear attribution
Instrument an OpenTelemetry span for every tool call and record gen_ai.usage.input_tokens and gen_ai.usage.output_tokens on it. Aggregate these by tool name in your observability dashboard to identify which tool's output is bloating the context window.
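A minimal sketch of the aggregation step, assuming the spans have already been exported and are available as plain dicts of attributes. The span records and tool names below are hypothetical, and gen_ai.tool.name is assumed as the attribute carrying the tool name; a real dashboard would run the equivalent group-by query in its own query language.

```python
from collections import defaultdict

# Hypothetical exported span attributes, one dict per tool-call span.
# The values are illustrative, not measurements.
spans = [
    {"gen_ai.tool.name": "search_docs",
     "gen_ai.usage.input_tokens": 1200, "gen_ai.usage.output_tokens": 300},
    {"gen_ai.tool.name": "fetch_orders",
     "gen_ai.usage.input_tokens": 9000, "gen_ai.usage.output_tokens": 400},
    {"gen_ai.tool.name": "fetch_orders",
     "gen_ai.usage.input_tokens": 9500, "gen_ai.usage.output_tokens": 350},
]


def aggregate_by_tool(span_records):
    """Sum token usage and call counts per tool name."""
    totals = defaultdict(
        lambda: {"input_tokens": 0, "output_tokens": 0, "calls": 0}
    )
    for span in span_records:
        entry = totals[span["gen_ai.tool.name"]]
        entry["input_tokens"] += span["gen_ai.usage.input_tokens"]
        entry["output_tokens"] += span["gen_ai.usage.output_tokens"]
        entry["calls"] += 1
    return dict(totals)


def rank_by_input_tokens(totals):
    """Return tool names ordered from most to fewest input tokens."""
    return sorted(totals, key=lambda name: totals[name]["input_tokens"],
                  reverse=True)


totals = aggregate_by_tool(spans)
ranking = rank_by_input_tokens(totals)
```

Sorting by input tokens rather than by call count or latency is the point here: a tool that is called rarely and returns quickly can still sit at the top of this ranking.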
Journey Context:
Standard observability tracks total tokens per LLM call. In agents, the context grows as tool outputs are appended, and everything already in the context is re-sent as input on each later call. A single tool returning a massive JSON payload may not stand out in tool execution time, yet it can cause every subsequent LLM call to consume many times more input tokens (10x, for example). You must attribute the token cost of those subsequent calls back to the tool that injected the large context.
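The attribution described above can be sketched as a single pass over an agent run's event log. This is an illustration under stated assumptions, not a definitive implementation: the ToolOutput and LLMCall types, the tool names, and the token counts are all hypothetical, and the sketch assumes that a tool output stays in the context unchanged (no truncation or summarization) for every later call.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class ToolOutput:
    tool: str
    tokens: int  # token count of the output appended to the context


@dataclass
class LLMCall:
    input_tokens: int  # total input tokens reported for this call


def attribute_carried_tokens(events):
    """Charge each LLM call's input tokens back to the tools whose
    outputs were already sitting in the context at call time."""
    resident = defaultdict(int)  # tokens currently in context, per tool
    carried = defaultdict(int)   # cumulative input tokens charged, per tool
    for event in events:
        if isinstance(event, ToolOutput):
            resident[event.tool] += event.tokens
        else:
            attributed = 0
            for tool, tokens in resident.items():
                carried[tool] += tokens
                attributed += tokens
            # Whatever is left is the prompt, user messages, and model turns.
            carried["_other"] += event.input_tokens - attributed
    return dict(carried)


# Hypothetical run: one large payload followed by three more LLM calls.
events = [
    LLMCall(input_tokens=500),
    ToolOutput(tool="fetch_orders", tokens=8000),
    LLMCall(input_tokens=8600),
    ToolOutput(tool="get_weather", tokens=50),
    LLMCall(input_tokens=8700),
    LLMCall(input_tokens=8750),
]
carried = attribute_carried_tokens(events)
```

In this made-up run the 8000-token payload is charged three times, once per later call, which is the cost that per-call totals and tool latency both hide.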
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T22:03:23.996383+00:00— report_created — created