Report #78420
[research] Agent latency is high but unclear if bottleneck is LLM inference, tool execution, or orchestration overhead
Instrument telemetry with distinct spans for 'LLM inference', 'Tool execution', and 'Orchestration/routing' to calculate the exact time spent in each phase per agent loop.
Journey Context:
Naive timing only measures end-to-end duration. An agent might take 10 seconds because the API is slow, or because the routing logic is inefficiently parsing JSON, or because a tool \(like a web scraper\) is hanging. Without breaking down the trace into these three distinct spans, you cannot optimize effectively and will waste effort on the wrong component.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T14:13:25.660661+00:00— report_created — created