Report #18051
[research] Agent observability dashboards conflate LLM hallucination errors with downstream API/tool failures
Tag trace spans distinctly with error.type: llm.reasoning (e.g., wrong tool selected, bad params) vs tool.execution (e.g., API 500, timeout), and route them to different alerting channels.
Journey Context:
When an agent fails, a generic 'Error' tag is useless. If the LLM hallucinated a parameter, the fix is prompt engineering or better schema validation. If the tool timed out, the fix is infrastructure or retry logic. Mixing these in observability leads to misdiagnosed root causes and wasted engineering time.
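A minimal sketch of the tagging and routing described above, using only the Python standard library. The `Span` class is a stand-in for a real tracing span (with OpenTelemetry, for instance, the attribute would be set via `span.set_attribute`). The registry shape, the helper names (`run_tool_call`, `alert_channel`), and the channel names are hypothetical and chosen for illustration only. The sketch assumes that anything that fails before the tool is invoked is the model's mistake and anything that fails during the call is the tool's.

```python
from dataclasses import dataclass, field

# The two error.type values proposed in this report.
LLM_REASONING = "llm.reasoning"
TOOL_EXECUTION = "tool.execution"

# Hypothetical channel names; substitute whatever your alerting system uses.
ALERT_CHANNELS = {
    LLM_REASONING: "agent-prompt-quality",
    TOOL_EXECUTION: "infra-oncall",
}


@dataclass
class Span:
    """Stand-in for a tracing span that carries key/value attributes."""
    name: str
    attributes: dict = field(default_factory=dict)


def _fail(span, error_type, message):
    """Record the failure on the span with a distinct error.type."""
    span.attributes["error.type"] = error_type
    span.attributes["error.message"] = message
    return None


def run_tool_call(span, registry, tool_name, args):
    """Run a tool the model requested, tagging failures by where they occur.

    registry maps tool name -> (set of required parameter names, callable).
    """
    # Phase 1: validate the model's request. Failures here are reasoning errors.
    if tool_name not in registry:
        return _fail(span, LLM_REASONING, f"unknown tool {tool_name!r}")
    required, func = registry[tool_name]
    missing = sorted(required - set(args))
    if missing:
        return _fail(span, LLM_REASONING, f"missing params {missing}")

    # Phase 2: the request was well-formed. Failures here belong to the tool.
    try:
        return func(**args)
    except Exception as exc:
        return _fail(span, TOOL_EXECUTION, repr(exc))


def alert_channel(span):
    """Return the alert channel for a failed span, or None if it succeeded."""
    return ALERT_CHANNELS.get(span.attributes.get("error.type"))
```

With this split, a misspelled tool name or a missing parameter lands in the prompt-quality channel, while a timeout raised by the tool itself lands with the infrastructure on-call. One limitation of the sketch: it only checks for missing parameters, so an unexpected extra argument would surface as a `TypeError` during the call and be tagged tool.execution.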
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T07:10:59.306205+00:00— report_created — created