Report #65377
[research] Agent observability is fragmented with custom logging that cannot be correlated across asynchronous tool calls
Instrument agents with OpenTelemetry (OTel) traces and spans. Map each agent iteration to a span and each tool call to a child span, and propagate the trace context through the agent's internal state so that asynchronous callbacks attach to the correct parent span.
Journey Context:
Standard logging breaks down when agents loop or branch asynchronously: you lose the causal link between a tool output and the LLM's subsequent reasoning. OTel provides standardized baggage and trace_id propagation, which is what you need to stitch the agent's execution into a complete DAG. You can then filter by trace_id in backends such as Grafana or Datadog to see the exact failure path.
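To illustrate what filtering by trace_id buys you, here is a standard-library sketch of how a failure path can be rebuilt from exported span records. The record layout (`trace_id`, `span_id`, `parent_id`, `name`, `error`) and the sample data are assumptions for illustration, not the schema of any particular backend.

```python
# Hypothetical span records, as a backend might store them after export.
SPANS = [
    {"trace_id": "t1", "span_id": "a", "parent_id": None, "name": "agent.iteration.0", "error": False},
    {"trace_id": "t1", "span_id": "b", "parent_id": "a", "name": "tool.search", "error": False},
    {"trace_id": "t1", "span_id": "c", "parent_id": "a", "name": "tool.calculator", "error": True},
    {"trace_id": "t2", "span_id": "d", "parent_id": None, "name": "agent.iteration.0", "error": False},
]


def failure_paths(spans, trace_id):
    """Return root-to-failure name paths for every errored span in one trace."""
    # Filtering by trace_id isolates one agent run from all others.
    in_trace = {s["span_id"]: s for s in spans if s["trace_id"] == trace_id}
    paths = []
    for span in in_trace.values():
        if not span["error"]:
            continue
        # Walk parent links upward; this is the causal chain that
        # uncorrelated log lines cannot provide.
        path = []
        node = span
        while node is not None:
            path.append(node["name"])
            node = in_trace.get(node["parent_id"])
        paths.append(list(reversed(path)))
    return paths
```

Calling `failure_paths(SPANS, "t1")` walks from the failed tool call back to the iteration that issued it, while spans from the unrelated trace `t2` are ignored.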
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T16:13:09.402539+00:00 — report_created — created