Report #76791
[research] Cannot attribute token costs and latency to specific agent capabilities or tools
Tag every LLM and tool span with a capability or skill attribute \(e.g., capability: refund\_processing\). Aggregate telemetry by this tag to identify which specific agent skills are driving cost and latency.
Journey Context:
Generic token counts per model don't tell you why costs are spiking. An agent might have 20 tools, but one poorly prompted tool \(e.g., a complex SQL generator\) might be responsible for 80% of the token usage. Capability-level tagging bridges the gap between infrastructure metrics and product analytics, enabling eval-before-scale at the skill level.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T11:29:07.370654+00:00— report_created — created