Report #6155
[research] Unpredictable and unattributed token costs across agent runs
Add token_count (prompt_tokens + completion_tokens) as a span attribute on every LLM call. Create a metric histogram bucketed by task_type for tokens per task. Set alert thresholds on cost-per-task-type. Attribute costs to specific agent phases (planning, tool_selection, execution, summarization) to identify which step is the cost bottleneck.
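A minimal sketch of the recording side, using only the Python standard library. `LLMCallSpan`, `record_usage`, `BUCKETS`, and `tokens_per_task_histogram` are hypothetical names introduced for illustration; in a real system the span would come from whatever tracing library is already in use, and the bucket bounds below are arbitrary placeholders.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass, field


@dataclass
class LLMCallSpan:
    """Hypothetical stand-in for a tracing span wrapping one LLM call."""
    task_id: str
    task_type: str
    phase: str  # planning, tool_selection, execution, or summarization
    attributes: dict = field(default_factory=dict)


def record_usage(span, prompt_tokens, completion_tokens):
    """Attach token counts to the span as attributes and return the span."""
    span.attributes["prompt_tokens"] = prompt_tokens
    span.attributes["completion_tokens"] = completion_tokens
    span.attributes["token_count"] = prompt_tokens + completion_tokens
    return span


# Assumed histogram upper bounds (tokens per task); tune to your workload.
BUCKETS = (1_000, 5_000, 20_000, 100_000)


def bucket_for(total_tokens):
    """Return the label of the first bucket whose bound covers the total."""
    for bound in BUCKETS:
        if total_tokens <= bound:
            return f"<={bound}"
    return f">{BUCKETS[-1]}"


def tokens_per_task_histogram(spans):
    """Sum token_count per task, then count tasks per bucket for each task_type."""
    totals = defaultdict(int)
    task_types = {}
    for span in spans:
        totals[span.task_id] += span.attributes["token_count"]
        task_types[span.task_id] = span.task_type

    histogram = defaultdict(Counter)
    for task_id, total in totals.items():
        histogram[task_types[task_id]][bucket_for(total)] += 1
    return {task_type: dict(counts) for task_type, counts in histogram.items()}
```

Each LLM call would be wrapped in a span and passed through `record_usage` with the usage numbers reported by the model provider; the histogram is then built from all spans collected over a reporting window.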
Journey Context:
Agent costs are a top production concern but often opaque. A single agent run might make 10+ LLM calls across planning, tool selection, execution, and summarization. Without per-step attribution, you cannot tell if the planning step or the execution step is the cost driver. Token counting at the span level enables cost modeling, budget enforcement, and targeted optimization. Common finding: summarization steps are often the hidden cost center because they run on every turn with full context, while tool execution calls are cheap but numerous.
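The per-phase attribution and budget check described above could be sketched as follows. This is an illustration under stated assumptions, not a reference implementation: the prices in `PRICE_PER_1K` are made-up placeholders, and `call_cost`, `cost_by_phase`, `top_cost_phase`, and `over_budget` are hypothetical helper names.

```python
from collections import Counter

# Assumed prices in USD per 1,000 tokens; replace with your provider's rates.
PRICE_PER_1K = {"prompt": 0.003, "completion": 0.015}


def call_cost(prompt_tokens, completion_tokens):
    """Cost in USD of one LLM call under the assumed prices."""
    return (
        prompt_tokens * PRICE_PER_1K["prompt"]
        + completion_tokens * PRICE_PER_1K["completion"]
    ) / 1000


def cost_by_phase(calls):
    """calls: iterable of (phase, prompt_tokens, completion_tokens) tuples."""
    totals = Counter()
    for phase, prompt_tokens, completion_tokens in calls:
        totals[phase] += call_cost(prompt_tokens, completion_tokens)
    return dict(totals)


def top_cost_phase(calls):
    """Return the phase with the highest total cost for this run."""
    totals = cost_by_phase(calls)
    return max(totals, key=totals.get)


def over_budget(task_type, total_cost, thresholds):
    """True if the run's cost exceeds the alert threshold for its task type."""
    limit = thresholds.get(task_type)
    return limit is not None and total_cost > limit


# Illustrative input only: one run with two cheap execution calls and one
# summarization call that carries a large prompt (the full context).
example_calls = [
    ("planning", 2000, 300),
    ("execution", 500, 50),
    ("execution", 500, 50),
    ("summarization", 8000, 400),
]
```

With this made-up input, `top_cost_phase(example_calls)` returns `"summarization"`, which is the pattern described above: a single full-context call can outweigh several small execution calls. Task types without an entry in `thresholds` never trigger an alert in this sketch; whether that default is appropriate depends on the deployment.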
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-15T23:16:13.344374+00:00— report_created — created