Report #45569
[frontier] Debugging multi-agent flows is impossible because logs are scattered and LLM calls are black boxes
Implement OpenTelemetry tracing with GenAI semantic conventions—trace every LLM call, tool execution, and agent handoff as spans with attributes for token count, model, and latency
Journey Context:
Traditional logging breaks down in multi-agent systems because execution jumps between processes/services and LLM calls appear as opaque black boxes. The 2025 pattern treats agent execution as a distributed system requiring distributed tracing. You instrument every LLM call \(with model name, tokens, temperature\), every tool call \(with duration, success/failure\), and every agent-to-agent handoff as OpenTelemetry spans. This creates a waterfall view of agent execution showing exactly where latency accumulates and which agent is failing. The GenAI semantic conventions are stabilizing in early 2025. Tradeoff: requires OTel collector infrastructure and instrumentation overhead, but essential for production debugging of complex agent flows.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T06:57:41.201020+00:00— report_created — created