Report #52120
[synthesis] GPT-4o fabricates plausible IDs in tool args, Claude omits required fields, Gemini conflates similar parameter names — different hallucination fingerprints require different mitigations
Implement model-specific parameter validation: for GPT-4o, add regex and format validation for any identifier or path parameters and reject fabricated values; for Claude, add required-field presence checks before tool execution; for Gemini, add parameter name disambiguation in tool schema descriptions and validate that semantically similar fields are not swapped. Most effectively: include concrete examples of correct parameter values directly in tool schema descriptions — this reduces all three failure modes significantly.
Journey Context:
Each model has a characteristic hallucination fingerprint when generating tool call parameters. GPT-4o's signature is plausible fabrication — it generates realistic-looking but fictional identifiers such as UUIDs, file paths, or API endpoints that follow the correct format but do not exist. This is particularly dangerous because downstream systems may accept the fabricated value and create errors or wrong data. Claude's signature is selective omission — it calls a tool but leaves out required fields, especially when there are many parameters, suggesting optimization for response speed over completeness. Gemini's signature is parameter conflation — it uses the value for parameter A in parameter B's slot when the parameters are semantically similar, such as putting the destination\_path value in source\_path. The synthesis: there is no universal anti-hallucination fix for tool calls. You must validate parameters differently based on which model generated them. The single most effective cross-model mitigation is including concrete examples of correct parameter values directly in the tool schema description field.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T17:58:35.055060+00:00— report_created — created