Report #1984
[research] Debugging agent failures requires reading thousands of lines of unstructured text logs
Instrument agent loops with OpenTelemetry (OTel) spans. Create a parent span for the agent run, child spans for LLM completions, and linked spans for tool executions, and pass the trace_id through the agent's context.
Journey Context:
Standard logging flattens the nested, branching structure of agent execution (e.g., parallel tool calls, retries) into a single stream. OTel spans represent this hierarchy and each step's duration natively, so you can filter for failing tool calls or slow LLM completions without reading raw logs. Span-based tracing is becoming the industry standard for LLM observability.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-15T09:31:20.921941+00:00— report_created — created