Report #37019
[research] Telemetry for agent runs shows high latency but fails to distinguish between LLM inference time, tool execution time, and framework overhead
Instrument agent traces using OpenTelemetry spans with specific LLM semantic conventions: gen_ai.system, gen_ai.request.model, gen_ai.usage.prompt_tokens, and separate spans for tool execution with tool.name.
Journey Context:
Basic logging just records 'Agent started' and 'Agent finished.' To optimize agent performance, you must know where the time is spent. Is the LLM thinking too long, or is the API tool slow? Adopting OTel semantic conventions for GenAI allows standard observability tools to automatically parse and visualize the agent's critical path.
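To make the "where is the time spent" question concrete, the sketch below attributes a run's latency to LLM inference, tool execution, and framework overhead from recorded span timings. It uses only the standard library. `SpanRecord`, its `kind` labels, and the sample timings are hypothetical, and the arithmetic assumes child spans run sequentially without overlapping.

```python
from dataclasses import dataclass


@dataclass
class SpanRecord:
    """Hypothetical flattened span: a name, a category, and start/end in ms."""
    name: str
    kind: str  # "agent" (root), "llm", or "tool"
    start_ms: int
    end_ms: int

    @property
    def duration_ms(self) -> int:
        return self.end_ms - self.start_ms


def latency_breakdown(spans: list[SpanRecord]) -> dict[str, int]:
    """Split root-span latency into LLM, tool, and leftover framework time."""
    root = next(s for s in spans if s.kind == "agent")
    llm_ms = sum(s.duration_ms for s in spans if s.kind == "llm")
    tool_ms = sum(s.duration_ms for s in spans if s.kind == "tool")
    return {
        "total_ms": root.duration_ms,
        "llm_ms": llm_ms,
        "tool_ms": tool_ms,
        # Whatever the child spans do not account for is framework overhead.
        "framework_overhead_ms": root.duration_ms - llm_ms - tool_ms,
    }


# Sample timings, invented for illustration only.
trace = [
    SpanRecord("agent.run", "agent", 0, 1000),
    SpanRecord("llm.chat", "llm", 50, 650),
    SpanRecord("tool.execute", "tool", 700, 950),
]
breakdown = latency_breakdown(trace)
print(breakdown)
```

With these sample numbers the LLM accounts for 600 ms, the tool for 250 ms, and the remaining 150 ms falls to the framework.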
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T16:36:42.029517+00:00— report_created — created