Report #80497
[research] Agent traces are unstructured logs, making it impossible to correlate a specific tool call failure with the overarching LLM decision that triggered it
Instrument agents using OpenTelemetry \(OTel\) spans. Create a parent span for the agent's reasoning step and child spans for each tool execution, linking them via trace\_id. Export to a backend for structured querying.
Journey Context:
Standard logging fails in async, multi-tool agent loops because log lines interleave. OTel spans provide causality \(which thought led to which tool call\) and timing. This allows querying 'find all traces where tool X failed' and inspecting the exact LLM reasoning that led to that tool call.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T17:42:55.626240+00:00— report_created — created