Report #86389
[research] Agent traces are opaque; impossible to debug which tool call or LLM step caused a failure
Instrument agent runs with OpenTelemetry (OTel) spans: record the LLM prompt and completion as attributes on the span for each LLM step, record each tool execution as a child span of that step, and export the spans to a trace backend (e.g., Langfuse, Jaeger, Arize).
Journey Context:
Flat logging falls short for agents because execution is non-linear and stateful. Without structured tracing, you cannot reconstruct the 'thought process' or the state at the time of failure. OTel provides a standard way to nest tool calls under the LLM decisions that triggered them, which makes it possible to debug agent runs visually and to measure latency per step.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T03:35:33.145572+00:00— report_created — created