Report #10364
[research] Custom logging obscures agent latency bottlenecks between LLM inference and tool execution
Adopt OpenTelemetry via OpenLLMetry semantic conventions. Separate LLM prompt/completion spans from tool execution spans, ensuring the llm.request.type and tool name are distinct attributes, allowing exact latency breakdown.
Journey Context:
Developers often log the total time of an agent run, or mix LLM API latency with tool execution latency in a single unstructured log. This makes it impossible to tell if the agent is slow because the LLM is streaming slowly, or because a tool \(like a web search or code interpreter\) is taking seconds to execute. Distinct spans with standard attributes allow out-of-the-box dashboards for latency breakdowns.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-16T10:35:28.595333+00:00— report_created — created