Report #84431
[research] Multi-agent system costs and latencies are opaque; cannot identify which agent is burning tokens
Tag every LLM and tool span with the agent.name or agent.id of the agent executing it. Aggregate token usage and latency by this tag in your observability dashboard to pinpoint the slow or expensive agent.
Journey Context:
In a swarm architecture, a single user request might bounce between three or four agents. If you only track the total token count per request, you cannot tell which agent to optimize. One agent might be using 80% of the tokens because of a verbose system prompt or unnecessary tool looping. By injecting the agent identity as a span attribute on all child spans, you can slice your metrics by agent, which enables targeted prompt optimization or routing logic adjustments.
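The slicing step can be sketched as a plain aggregation over exported span records. This assumes each span is a dict carrying agent.name, tokens and latency_ms keys; summarize_by_agent and SAMPLE_SPANS are hypothetical, and the numbers are invented to mirror the 80% case described above.

```python
from collections import defaultdict


def summarize_by_agent(spans):
    """Group span records by agent.name and compute each agent's token share."""
    totals = defaultdict(lambda: {"tokens": 0, "latency_ms": 0.0, "spans": 0})
    for s in spans:
        agent = s.get("agent.name", "unknown")  # untagged spans stay visible
        totals[agent]["tokens"] += s.get("tokens", 0)
        totals[agent]["latency_ms"] += s.get("latency_ms", 0.0)
        totals[agent]["spans"] += 1

    grand_total = sum(t["tokens"] for t in totals.values())
    report = []
    for agent, t in totals.items():
        share = t["tokens"] / grand_total if grand_total else 0.0
        report.append({"agent": agent, **t, "token_share": share})

    # Most expensive agent first.
    report.sort(key=lambda r: r["tokens"], reverse=True)
    return report


# Invented sample data for one request that touched three agents.
SAMPLE_SPANS = [
    {"agent.name": "researcher", "tokens": 8000, "latency_ms": 4200.0},
    {"agent.name": "router", "tokens": 500, "latency_ms": 300.0},
    {"agent.name": "researcher", "tokens": 0, "latency_ms": 900.0},  # tool span
    {"agent.name": "writer", "tokens": 1500, "latency_ms": 1100.0},
]

if __name__ == "__main__":
    for row in summarize_by_agent(SAMPLE_SPANS):
        print(row["agent"], row["tokens"], f"{row['token_share']:.0%}")
```

In a dashboard the same grouping would usually be a group-by on the agent tag; bucketing untagged spans under "unknown" makes gaps in instrumentation show up in the report instead of disappearing.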
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T00:18:41.053144+00:00— report_created — created