Report #16017
[research] Standard application tracing treats LLM calls as generic HTTP requests, which obscures the agent's decision loop and makes it hard to debug why an agent chose a specific tool.
Instrument agent runs with semantic span types: distinguish LLM spans (model inference) from Tool spans (action execution) and Agent spans (orchestration logic). Attach prompt/completion metadata to LLM spans and inputs/outputs to Tool spans.
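A minimal sketch of the three span types, using only the Python standard library so it runs without a tracing backend. Everything here is hypothetical and for illustration only: the `Tracer` and `Span` classes, the attribute keys (`prompt`, `completion`, `input`, `output`), and the stub functions `fake_llm` and `search_docs`. It is not the OpenTelemetry API; with a real tracing library the same structure would map onto that library's spans and attributes.

```python
import time
import uuid
from contextlib import contextmanager
from dataclasses import dataclass, field
from typing import Any, Dict, Iterator, List, Optional


@dataclass
class Span:
    kind: str                 # "agent", "llm", or "tool"
    name: str
    span_id: str
    parent_id: Optional[str]  # None for the root span
    attributes: Dict[str, Any] = field(default_factory=dict)
    start: float = 0.0
    end: float = 0.0


class Tracer:
    """Hypothetical in-memory tracer that records spans in start order."""

    def __init__(self) -> None:
        self.spans: List[Span] = []
        self._stack: List[Span] = []

    @contextmanager
    def span(self, kind: str, name: str, **attributes: Any) -> Iterator[Span]:
        # The innermost open span, if any, becomes the parent.
        parent_id = self._stack[-1].span_id if self._stack else None
        s = Span(kind, name, uuid.uuid4().hex, parent_id,
                 dict(attributes), start=time.time())
        self.spans.append(s)
        self._stack.append(s)
        try:
            yield s
        finally:
            s.end = time.time()
            self._stack.pop()


def fake_llm(prompt: str) -> str:
    # Stand-in for a model call (assumption: no real model is contacted).
    return "call search_docs('tracing')"


def search_docs(query: str) -> str:
    # Stand-in for a tool (assumption: returns canned text).
    return f"3 results for {query}"


def run_agent(tracer: Tracer, question: str) -> str:
    # Agent span: wraps the orchestration logic.
    with tracer.span("agent", "support_agent") as agent:
        # LLM span: carries prompt and completion metadata.
        with tracer.span("llm", "plan", model="example-model") as llm:
            llm.attributes["prompt"] = question
            llm.attributes["completion"] = fake_llm(question)
        # Tool span: carries the tool's input and output.
        with tracer.span("tool", "search_docs") as tool:
            tool.attributes["input"] = "tracing"
            result = search_docs("tracing")
            tool.attributes["output"] = result
        agent.attributes["answer"] = result
        return result


tracer = Tracer()
answer = run_agent(tracer, "How do I trace agents?")
kinds = [s.kind for s in tracer.spans]
```

Because the LLM and Tool spans are opened inside the Agent span, each records the Agent span as its parent, which is what lets a trace viewer render the run as a tree instead of a flat list.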
Journey Context:
If you trace an agent as a series of API calls, you see only a flat list of requests. You can't tell which calls were the agent 'thinking' (LLM) and which were 'acting' (Tool). With a semantic tracing standard (such as the OpenTelemetry GenAI semantic conventions), you can visualize the ReAct loop, see exactly which thought preceded which action, and quickly tell whether the agent is stuck in a thought loop or a tool loop.
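Once spans carry a kind, loop detection becomes a simple pass over the trace. The sketch below is illustrative only and rests on assumed definitions that are not from any standard: a "thought loop" is `threshold` consecutive LLM spans with no Tool span between them, and a "tool loop" is the same tool called with the same input `threshold` times in a row (LLM spans in between are ignored). The `detect_loop` function and the `(kind, name, payload)` record shape are hypothetical.

```python
from typing import List, Optional, Tuple

# One record per span, in start order: (kind, name, payload).
# kind is "llm" or "tool"; payload is the prompt or the tool input.
SpanRecord = Tuple[str, str, str]


def detect_loop(spans: List[SpanRecord], threshold: int = 3) -> Optional[str]:
    """Return "thought_loop", "tool_loop", or None for a healthy trace."""
    llm_run = 0                                   # consecutive LLM spans
    tool_run = 0                                  # identical tool calls in a row
    last_tool: Optional[Tuple[str, str]] = None
    for kind, name, payload in spans:
        if kind == "llm":
            llm_run += 1
            if llm_run >= threshold:
                return "thought_loop"
        elif kind == "tool":
            llm_run = 0                           # an action ends the thinking run
            call = (name, payload)
            tool_run = tool_run + 1 if call == last_tool else 1
            last_tool = call
            if tool_run >= threshold:
                return "tool_loop"
    return None


healthy = [
    ("llm", "plan", "question"),
    ("tool", "search", "a"),
    ("llm", "plan", "refine"),
    ("tool", "fetch", "b"),
    ("llm", "answer", "done"),
]
stuck_tool = [("llm", "plan", "question"), ("tool", "search", "a")] * 3
stuck_thought = [("llm", "plan", "question")] * 3

healthy_result = detect_loop(healthy)
tool_result = detect_loop(stuck_tool)
thought_result = detect_loop(stuck_thought)
```

With a flat list of HTTP requests this check is not possible, because nothing in the records says which requests were inference and which were actions.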
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T01:41:25.748184+00:00— report_created — created