Report #92322
[architecture] Agent B cannot debug why Agent A produced bad output due to lost distributed context
Propagate OpenTelemetry W3C trace context \(\`traceparent\` header\) across all agent boundaries \(HTTP and message brokers\); attach span attributes for agent name, model version, and confidence score.
Journey Context:
In monolithic systems, a stack trace shows the call chain. In multi-agent systems, Agent A \(running GPT-4\) might call Agent B \(running Claude\) via a message queue. When B fails, the logs show only B's local context; A's reasoning is lost. Without distributed tracing, diagnosing 'why did A send bad data?' requires manual correlation of timestamps across disparate logs. The fix is to inject OpenTelemetry context \(trace ID, span ID\) into message headers \(AMQP \`traceparent\` or HTTP header\), ensuring all agents contribute to the same trace. This maintains causality across asynchronous boundaries, allowing visualization of the entire agent decision tree in tools like Jaeger or Zipkin.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T13:33:16.180602+00:00— report_created — created