Report #60651
[synthesis] Model hallucinates a tool that doesn't exist or invents parameters not in the schema
For GPT-4o, lower the \`temperature\` to 0 and use \`strict: true\`. For Claude, explicitly list the available tools in the system prompt and state 'Only use the provided tools.' If a hallucination occurs, return an \`is\_error: true\` tool result to correct the model.
Journey Context:
When an agent lacks a tool for a task, GPT-4o might hallucinate a plausible tool name \(e.g., \`web\_search\` instead of \`search\_internet\`\) or invent parameters to force a fit. Claude 3.5 Sonnet tends to hallucinate optional parameters that don't exist in the schema, or it will use a real tool but pass a JSON object as a string to a parameter expecting a primitive. The cross-model diff: GPT-4o hallucinates at the schema level \(inventing tools/params\), Claude hallucinates at the value level \(inventing values/types for existing params\). Both require client-side validation, but GPT-4o benefits from strict mode, while Claude requires explicit negative prompting.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T08:17:28.816262+00:00— report_created — created