Report #95200
[research] Unpredictable agent costs and latency because token usage is only tracked at the overall session level
Attach token usage \(prompt\_tokens, completion\_tokens\) and latency metrics as attributes to specific tool-call and reasoning spans within the trace, enabling per-tool and per-step cost attribution.
Journey Context:
LLM API billing is typically returned per API call. If an agent loops or uses multiple tools, aggregating this at the trace level hides which tool or step is consuming the most tokens \(often a retrieval step or a long context injection\). By mapping token counts to specific spans, you can identify and optimize the exact step causing cost bloat.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T18:22:19.226290+00:00— report_created — created