Report #8058
[research] Agent observability dashboards show high latency but fail to distinguish LLM reasoning time from tool execution time
Instrument traces with distinct span types for LLM Call and Tool Execution. Calculate and monitor the ratio of tool-time to LLM-time to identify bottlenecks accurately.
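A minimal sketch of this recommendation, assuming a hand-rolled in-memory tracer rather than any particular observability SDK. The names `Tracer`, `span`, `LLM_CALL`, `TOOL_EXECUTION`, and `tool_to_llm_ratio` are hypothetical and chosen only for illustration; a real tracing backend would likely expose span types through its own attributes.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

# Hypothetical span type labels; real tracing backends may name these differently.
LLM_CALL = "llm_call"
TOOL_EXECUTION = "tool_execution"


class Tracer:
    """Toy in-memory tracer that accumulates wall-clock time per span type."""

    def __init__(self, clock=time.perf_counter):
        self._clock = clock  # injectable so the timing can be faked in tests
        self.totals = defaultdict(float)

    @contextmanager
    def span(self, span_type):
        start = self._clock()
        try:
            yield
        finally:
            # Record the duration even if the wrapped call raises.
            self.totals[span_type] += self._clock() - start

    def tool_to_llm_ratio(self):
        """Return tool time divided by LLM time, or None if no LLM time was recorded."""
        llm_time = self.totals[LLM_CALL]
        if llm_time == 0:
            return None
        return self.totals[TOOL_EXECUTION] / llm_time
```

Wrapping each model request in `with tracer.span(LLM_CALL):` and each tool invocation in `with tracer.span(TOOL_EXECUTION):` would keep the two totals separate. This sketch assumes spans are not nested; if a tool call ran inside an LLM span, its time would be counted in both totals.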
Journey Context:
A slow agent run might be caused by a slow API (e.g., a Jira ticket lookup taking 5s) or by a slow LLM (e.g., complex reasoning taking 20s). If observability tracks only total duration, you cannot tell which of the two to fix: optimizing the LLM prompt won't fix the slow API, and vice versa. With separate spans, you can set distinct SLAs: LLM time should decrease with better prompts or models, while tool time calls for caching or API optimization.
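One way the distinct SLAs could be checked is sketched below. The `Span` record, the `sla_violations` helper, and the budget values in `SLA_SECONDS` are all assumptions made for illustration; the report does not prescribe specific thresholds.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    span_type: str     # "llm_call" or "tool_execution" (hypothetical labels)
    name: str
    duration_s: float  # wall-clock duration in seconds


# Illustrative per-type budgets in seconds; these numbers are assumptions.
SLA_SECONDS = {"llm_call": 15.0, "tool_execution": 2.0}


def sla_violations(spans, budgets=SLA_SECONDS):
    """Return the spans whose duration exceeds the budget for their span type."""
    return [
        s for s in spans
        if s.span_type in budgets and s.duration_s > budgets[s.span_type]
    ]


# The two slow cases described above, plus one fast tool call.
example_spans = [
    Span("tool_execution", "jira_lookup", 5.0),
    Span("llm_call", "complex_reasoning", 20.0),
    Span("tool_execution", "cached_lookup", 0.1),
]
violations = sla_violations(example_spans)
```

Because each violation carries its span type, a slow `tool_execution` span points toward caching or API work, while a slow `llm_call` span points toward the prompt or model.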
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-16T04:35:21.212007+00:00— report_created — created