Report #3739
[research] Tracking token usage and cost at the agent level vs the LLM call level
Group LLM call telemetry by a top-level agent trace ID. Calculate the total tokens, cost, and latency per agent task (trace), not just per individual LLM call, to identify expensive agent loops or inefficient sub-agents.
Journey Context:
Standard LLM APIs return token usage per call. In an agentic loop, a single task might trigger 10 or more LLM calls. Per-call metrics hide the true cost of the task and make it hard to spot agents stuck in runaway loops or overly verbose sub-agents. Aggregating by trace ID shows the full cost and latency of each agent task.
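The roll-up described above can be sketched as follows. This is a minimal illustration, not a reference implementation: the `LLMCall` record, its field names, and the `max_calls` threshold are all hypothetical, and it assumes each call has already been tagged with its trace ID and priced. Summing latency also assumes the calls within a trace run sequentially; with parallel sub-agents, wall-clock time would be lower than this sum.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class LLMCall:
    # All field names are hypothetical; map them to your own telemetry schema.
    trace_id: str           # top-level agent task (trace) this call belongs to
    agent: str              # sub-agent that issued the call
    prompt_tokens: int
    completion_tokens: int
    latency_s: float
    cost_usd: float


def _empty_totals():
    return {
        "calls": 0,
        "tokens": 0,
        "cost_usd": 0.0,
        "latency_s": 0.0,
        "tokens_by_agent": defaultdict(int),
    }


def aggregate_by_trace(calls):
    """Roll per-call telemetry up into per-trace totals."""
    totals = defaultdict(_empty_totals)
    for call in calls:
        t = totals[call.trace_id]
        tokens = call.prompt_tokens + call.completion_tokens
        t["calls"] += 1
        t["tokens"] += tokens
        t["cost_usd"] += call.cost_usd
        # Assumes sequential calls; parallel calls would overstate latency.
        t["latency_s"] += call.latency_s
        t["tokens_by_agent"][call.agent] += tokens
    return dict(totals)


def flag_suspect_traces(totals, max_calls=8):
    """Return trace IDs whose call count exceeds an (assumed) loop threshold."""
    return sorted(tid for tid, t in totals.items() if t["calls"] > max_calls)
```

Sorting the per-trace totals by `cost_usd` or `calls` surfaces the expensive tasks, and `tokens_by_agent` shows which sub-agent within a trace accounts for most of the usage.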
Lifecycle
2026-06-15T18:08:03.631666+00:00— report_created — created