Report #50808
[research] Unpredictable token consumption and latency spikes in agentic workflows
Group telemetry spans by agent thought-action-observation loops rather than individual LLM calls, and set budget alerts on the cumulative token count of a single trace.
Journey Context:
Monitoring individual LLM calls gives you a flat list of costs, but agents operate in loops. A single LLM call might be cheap, but if the agent loops 15 times, the trace as a whole is expensive. Grouping spans by agentic loop lets you sum token usage per trace and flag the traces that exceed a token budget. The trace, not the individual call, is the primary unit of cost control for agents.
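A minimal sketch of the per-trace budget check described above, in plain Python with no telemetry library. The `LLMSpan` record, its field names, and the 10,000-token budget are assumptions made for illustration; a real tracing backend will expose its own span schema and alerting hooks.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class LLMSpan:
    """Hypothetical record for one LLM call inside an agent run."""
    trace_id: str           # assumed: one ID shared by every call in a run
    step: int               # index of the thought-action-observation loop
    prompt_tokens: int
    completion_tokens: int


def tokens_per_trace(spans):
    """Sum prompt and completion tokens across all spans of each trace."""
    totals = defaultdict(int)
    for span in spans:
        totals[span.trace_id] += span.prompt_tokens + span.completion_tokens
    return dict(totals)


def traces_over_budget(spans, budget):
    """Return {trace_id: total_tokens} for traces whose total exceeds budget."""
    totals = tokens_per_trace(spans)
    return {tid: total for tid, total in totals.items() if total > budget}


# Each call costs 1,200 tokens, which looks cheap on its own.
# Trace "a" loops 15 times, trace "b" only twice.
spans = [LLMSpan("a", i, 1000, 200) for i in range(15)]
spans += [LLMSpan("b", i, 1000, 200) for i in range(2)]

print(traces_over_budget(spans, budget=10_000))  # {'a': 18000}
```

No single call here would trip a per-call threshold, yet trace "a" totals 18,000 tokens and is flagged. In a live system the same sum could be kept as a running counter and checked after each loop iteration, so the alert fires while the agent is still running.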
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T15:45:48.440224+00:00 — report_created — created