Report #40437
[synthesis] Claude 3.5 Sonnet hallucinates tool calls for tools that don't exist or aren't relevant, while GPT-4o strictly limits itself to provided schemas
Limit the number of tools passed to Claude in a single prompt (fewer than 5 if possible) and explicitly state 'You only have access to the following tools. Do not invent tools.'
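One way to apply the advice above is sketched below, assuming tool schemas are plain dicts with a `name` key. The helpers `select_tools` and `build_system_prompt` and the `MAX_TOOLS` cap are hypothetical names made up for illustration and are not part of any SDK.

```python
from typing import Any

# Assumption: "<5 if possible" is treated here as a hard cap of 4 tools.
MAX_TOOLS = 4


def select_tools(tools: list[dict[str, Any]], relevant_names: set[str]) -> list[dict[str, Any]]:
    """Keep only the tools relevant to the current task, capped at MAX_TOOLS."""
    chosen = [t for t in tools if t["name"] in relevant_names]
    return chosen[:MAX_TOOLS]


def build_system_prompt(base_prompt: str, tools: list[dict[str, Any]]) -> str:
    """Append an explicit tool allow-list to the system prompt."""
    names = [t["name"] for t in tools]
    if names:
        constraint = (
            "You only have access to the following tools: "
            + ", ".join(names)
            + ". Do not invent tools."
        )
    else:
        constraint = "You have no tools available. Do not invent tools."
    return f"{base_prompt}\n\n{constraint}"
```

The resulting prompt string and the trimmed tool list would then be passed to whatever client call the agent already uses. Whether this wording is enough to suppress phantom calls is unverified, as with the rest of this report.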
Journey Context:
Claude 3.5 Sonnet is aggressively optimized for agentic use cases, resulting in a behavioral fingerprint of 'over-eager tool calling.' If a task resembles something a tool should do (e.g., searching the web), Claude will attempt to call a `search` tool even if none is defined, emitting malformed JSON or a plain-text block in place of a valid tool call. GPT-4o is strictly bound by the schema provided; if no tool matches, it answers from its pre-training knowledge. This divergence means that migrating an agent from GPT-4o to Claude often introduces phantom tool call errors unless the system prompt explicitly constrains Claude's tool-calling ambition.
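As a complement to the prompt constraint, an agent loop can also check each requested tool name against the schemas it actually supplied before dispatching anything. The following is a minimal sketch, assuming tool calls have already been parsed into a simple structure. `ToolCall`, `split_tool_calls`, and `phantom_feedback` are hypothetical names for illustration, not a vendor API.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class ToolCall:
    """Hypothetical parsed tool call: a tool name plus its arguments."""
    name: str
    arguments: dict[str, Any]


def split_tool_calls(
    calls: list[ToolCall], tools: list[dict[str, Any]]
) -> tuple[list[ToolCall], list[ToolCall]]:
    """Separate calls to defined tools from calls to tools that were never provided."""
    allowed = {t["name"] for t in tools}
    valid = [c for c in calls if c.name in allowed]
    phantom = [c for c in calls if c.name not in allowed]
    return valid, phantom


def phantom_feedback(phantom: list[ToolCall], tools: list[dict[str, Any]]) -> str:
    """Build a corrective message that can be sent back to the model."""
    requested = ", ".join(sorted({c.name for c in phantom}))
    available = ", ".join(t["name"] for t in tools) or "none"
    return (
        f"The following tools do not exist: {requested}. "
        f"Available tools: {available}. Do not invent tools."
    )
```

Rejecting unknown names before dispatch turns a phantom call into a recoverable turn in the conversation instead of a runtime error, which may ease a GPT-4o to Claude migration.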
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T22:20:46.396657+00:00 — report_created — created