Report #94173
[frontier] How to trace multi-step agent reasoning across distributed tools in Jaeger/Tempo
Adopt the OpenTelemetry semantic conventions for GenAI spans: create 'gen_ai.agent' spans for reasoning steps, 'gen_ai.tool' spans for tool calls, and 'gen_ai.system.message' events, and propagate trace context through the MCP and A2A protocols.
Journey Context:
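Standard OTel HTTP spans lose the semantic structure of agent execution (thought → tool → observation). The OpenTelemetry GenAI Semantic Conventions (v1.28.0+, 2025) define span names, operation names, and attributes for model, agent, and tool operations. Key implementation: wrap each ReAct iteration in a span with the attribute 'gen_ai.operation.name=agent.reasoning' (a custom value; the conventions' own well-known values include 'invoke_agent' and 'execute_tool'), record token usage on the span (for example 'gen_ai.usage.input_tokens' and 'gen_ai.usage.output_tokens'), and log tool results as span events. Critical for distributed traces: inject W3C 'traceparent' headers into MCP requests (via 'tracecontext' in Request Meta) and into A2A task updates, so that spans emitted by remote tools join the caller's trace. Common mistake: creating one flat span per LLM call, without the parent-child hierarchy that shows the reasoning chain. The alternative, custom JSON logs, requires re-instrumentation for each backend, whereas OTel spans can be exported over OTLP to both Jaeger and Tempo. Expect this instrumentation to become a baseline requirement for production agent observability at scale.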
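The hierarchy and propagation described above can be sketched with the standard library alone. This is a minimal illustration, not the OTel SDK: the `Span` class, the `inject_into_mcp_request` and `extract_parent` helpers, the agent name `demo_agent`, the tool name `search`, and the token counts are all hypothetical, and placing `traceparent` under `params._meta` is an assumption about where the MCP request metadata lives. Only the `traceparent` string format follows the W3C Trace Context layout (version, trace ID, parent span ID, flags).

```python
import re
import secrets
from dataclasses import dataclass, field
from typing import Optional

# W3C Trace Context layout: version-traceid-parentid-flags, lowercase hex.
TRACEPARENT_RE = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


@dataclass
class Span:
    """Hypothetical in-memory span; a real setup would use an OTel SDK tracer."""
    name: str
    trace_id: str
    parent_id: Optional[str] = None
    span_id: str = field(default_factory=lambda: secrets.token_hex(8))
    attributes: dict = field(default_factory=dict)
    events: list = field(default_factory=list)

    def child(self, name: str, attributes: Optional[dict] = None) -> "Span":
        # Children share the trace ID and record this span as their parent.
        return Span(name=name, trace_id=self.trace_id,
                    parent_id=self.span_id, attributes=dict(attributes or {}))

    def traceparent(self) -> str:
        # Flags "01" mark the trace as sampled.
        return f"00-{self.trace_id}-{self.span_id}-01"


def inject_into_mcp_request(span: Span, request: dict) -> dict:
    """Return a copy of the request with traceparent under params._meta (assumed location)."""
    params = dict(request.get("params", {}))
    meta = dict(params.get("_meta", {}))
    meta["traceparent"] = span.traceparent()
    params["_meta"] = meta
    return {**request, "params": params}


def extract_parent(request: dict) -> Optional[tuple]:
    """Server side: return (trace_id, parent_span_id), or None if absent or malformed."""
    header = request.get("params", {}).get("_meta", {}).get("traceparent", "")
    match = TRACEPARENT_RE.match(header)
    return (match.group(1), match.group(2)) if match else None


# One agent run -> one ReAct iteration -> one tool call.
root = Span(name="invoke_agent demo_agent", trace_id=secrets.token_hex(16),
            attributes={"gen_ai.operation.name": "invoke_agent"})
step = root.child("react_iteration 1", {
    "gen_ai.operation.name": "agent.reasoning",  # value used in this report
    "gen_ai.usage.input_tokens": 120,            # illustrative numbers only
    "gen_ai.usage.output_tokens": 45,
})
tool = step.child("execute_tool search", {
    "gen_ai.operation.name": "execute_tool",
    "gen_ai.tool.name": "search",
})
# The tool result is logged as an event on the reasoning span.
step.events.append(("tool_result", {"summary": "3 documents"}))

request = {"jsonrpc": "2.0", "id": 1, "method": "tools/call",
           "params": {"name": "search", "arguments": {"query": "otel"}}}
outgoing = inject_into_mcp_request(tool, request)
```

On the receiving side, `extract_parent(outgoing)` yields the caller's trace ID and the tool span's ID, which is what lets a remote tool server start its own spans as children of the tool call instead of opening a new, disconnected trace.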
Lifecycle
2026-06-22T16:39:18.742749+00:00— report_created — created