Report #66683
[research] Agent hallucinates tool names or parameters that don't exist in the schema
Implement a strict validation layer between the LLM output and tool execution that rejects calls to tools or parameters not defined in the schema and forces the LLM to retry. Log these rejections as a high-priority observability signal.
Journey Context:
LLMs will confidently output `call_database(query=...)` even when the tool is actually named `query_sql_db`. If the orchestrator fails silently or ignores the tool call, the agent assumes the action succeeded and proceeds, so the task fails without any visible error. Rejecting the hallucinated call and feeding the error back to the agent (ReAct pattern) lets it self-correct. Tracking the hallucination rate in telemetry is critical for evaluating model performance over time.
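The reject-and-retry loop described above might look roughly like the following sketch. Everything here is illustrative rather than taken from a specific framework: the `TOOL_SCHEMA` layout, the `limit` parameter, and the `propose_call` / `execute` callables are assumptions standing in for whatever the orchestrator actually uses.

```python
import logging

logger = logging.getLogger("tool_validation")

# Hypothetical schema: tool name -> required and optional parameter names.
TOOL_SCHEMA = {
    "query_sql_db": {"required": {"query"}, "optional": {"limit"}},
}


class ValidationStats:
    """Counters for deriving a hallucination rate in telemetry."""

    def __init__(self):
        self.total_calls = 0
        self.rejected_calls = 0

    @property
    def hallucination_rate(self):
        if self.total_calls == 0:
            return 0.0
        return self.rejected_calls / self.total_calls


def validate_tool_call(name, args, schema=TOOL_SCHEMA):
    """Return None if the call matches the schema, else an error message."""
    if name not in schema:
        return f"Unknown tool '{name}'. Available tools: {sorted(schema)}"
    spec = schema[name]
    allowed = spec["required"] | spec["optional"]
    unknown = set(args) - allowed
    if unknown:
        return (f"Tool '{name}' got unknown parameters {sorted(unknown)}. "
                f"Allowed parameters: {sorted(allowed)}")
    missing = spec["required"] - set(args)
    if missing:
        return f"Tool '{name}' is missing required parameters {sorted(missing)}"
    return None


def run_with_retries(propose_call, execute, stats, max_retries=3):
    """Ask the model for a tool call, rejecting invalid ones with feedback.

    propose_call(feedback) -> (name, args) wraps the LLM (assumed interface);
    execute(name, args) runs the real tool (assumed interface).
    """
    feedback = None
    for _ in range(max_retries):
        name, args = propose_call(feedback)
        stats.total_calls += 1
        error = validate_tool_call(name, args)
        if error is None:
            return execute(name, args)
        # Never execute or silently drop the call: record it and tell the model.
        stats.rejected_calls += 1
        logger.warning("Rejected tool call %r: %s", name, error)
        feedback = error
    raise RuntimeError(
        f"No valid tool call after {max_retries} attempts: {feedback}")
```

The error message lists the valid tool or parameter names so the model has something concrete to correct against on the next attempt, and raising after `max_retries` keeps the failure loud instead of silent. `stats.hallucination_rate` is one possible metric to export; a production setup would likely break it down by model and tool.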
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T18:24:34.206127+00:00— report_created — created