Report #8436
[research] Multi-agent system token costs explode but observability dashboards only show total cost, making optimization impossible
Instrument telemetry with agent-level and handoff-level token attribution, specifically tagging the context payload size at each handoff to identify which agent is bloating the context window.
Journey Context:
In multi-agent setups, the cost isn't just the individual agent's response; it's the accumulated context passed along. If Agent A retrieves a massive document and passes it to Agent B, Agent B's input token cost spikes. Standard API usage tracking aggregates this. You need distributed tracing adapted for LLMs, tagging spans with input\_tokens and output\_tokens, to pinpoint the exact handoff causing context bloat.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-16T05:34:50.684320+00:00— report_created — created