Report #60032
[research] Agent prompt and tool schema changes cause silent token consumption spikes, raising costs without failing tasks
Add OpenTelemetry spans that record token counts broken down by prompt, completion, and tool schema overhead. Alert when prompt tokens per successful task increase relative to a baseline.
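A minimal sketch of the breakdown and the alert check, in plain Python so it runs without extra dependencies. The attribute names under `agent.tokens.*`, the `TurnTokens` type, and the 20% threshold are all assumptions made for illustration and are not taken from any standard or from this report.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TurnTokens:
    """Token counts for one turn of the agent loop (hypothetical shape)."""
    system_prompt: int
    tool_schema: int
    history: int
    completion: int


def span_attributes(t: TurnTokens) -> dict[str, int]:
    """Build span attributes that separate static context from dynamic tokens.

    The key names are illustrative, not an established convention.
    """
    static = t.system_prompt + t.tool_schema
    return {
        "agent.tokens.system_prompt": t.system_prompt,
        "agent.tokens.tool_schema": t.tool_schema,
        "agent.tokens.history": t.history,
        "agent.tokens.static_context": static,
        "agent.tokens.prompt_total": static + t.history,
        "agent.tokens.completion": t.completion,
    }


def prompt_regression(baseline: float, current: float, threshold: float = 0.2) -> bool:
    """Return True when prompt tokens per successful task rose by more than
    `threshold` (a fraction) relative to the baseline."""
    if baseline <= 0:
        return False  # no usable baseline yet, so do not alert
    return (current - baseline) / baseline > threshold
```

With the OpenTelemetry Python API, the returned dictionary could be attached to the span for each model call via `span.set_attributes(...)`. The alert would then compare the per-task sum of `agent.tokens.prompt_total`, restricted to successful tasks, against a stored baseline.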
Journey Context:
Adding a new tool or lengthening the system prompt increases the tokens sent on every turn of an agentic loop, so the overhead is multiplied by the number of turns. The agent still completes the task and outcome evals pass, but the cost per task can rise sharply, in some cases doubling. Observability must separate static context tokens (system prompt and tool schemas) from dynamic tokens (conversation history and generated output) to catch these silent cost regressions.
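To make the multiplication concrete, here is a small worked example under simplifying assumptions: every turn resends the same static context, and the dynamic portion of the prompt is treated as a fixed average per turn. The function name and all numbers are hypothetical.

```python
def prompt_tokens_per_task(static_tokens: int, dynamic_tokens_per_turn: int, turns: int) -> int:
    """Total prompt tokens for one task when the static context
    (system prompt plus tool schemas) is resent on every turn."""
    return turns * (static_tokens + dynamic_tokens_per_turn)


# Baseline: 2,000 static tokens, 500 dynamic tokens per turn, 10 turns.
before = prompt_tokens_per_task(2000, 500, 10)

# After adding tool schemas worth 2,500 tokens; the task still takes 10 turns.
after = prompt_tokens_per_task(4500, 500, 10)

print(before, after, after / before)  # 25000 50000 2.0
```

The task outcome is identical in both cases, which is why only a per-task token metric, and not an outcome eval, reveals the change.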
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T07:15:15.585993+00:00— report_created — created