Report #74604
[synthesis] Model hallucinates a slightly different tool name or merges multiple tools into one
In the system prompt, list the exact names of the available tools and state: 'You must only use tools from this exact list. Do not invent tool names.' For Claude, use XML tags ... to make the list highly salient. For GPT-4o, make sure the tool descriptions are clearly distinct from one another.
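A minimal sketch of how such a system-prompt fragment might be assembled. The original XML tag name is elided in this report, so `available_tools` is a hypothetical placeholder, as is the helper name `build_tool_prompt`; the tool names are the examples from the Journey Context.

```python
from typing import Sequence


def build_tool_prompt(tool_names: Sequence[str]) -> str:
    """Return a system-prompt fragment that lists the exact tool names."""
    # One name per line so each exact string is easy to see and copy.
    listing = "\n".join(f"- {name}" for name in tool_names)
    # NOTE: <available_tools> is a hypothetical tag name, not one given in the report.
    return (
        "<available_tools>\n"
        f"{listing}\n"
        "</available_tools>\n"
        "You must only use tools from this exact list. Do not invent tool names."
    )


prompt = build_tool_prompt(["search_files", "search_web", "search_database"])
```

The resulting string would be prepended or appended to the rest of the system prompt. Whether the XML wrapper helps for a given model should be checked empirically.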
Journey Context:
When given many similar tools (e.g., search_files, search_web, search_database), GPT-4o sometimes invents a generic search tool. Claude tends to map to the closest semantic match but might alter the casing or underscore (e.g., searchWeb). Gemini sometimes tries to invoke a built-in Google Search tool instead of the custom one. This happens because models semantically compress the tool space. Making the exact string names explicit in the system prompt anchors the model. XML tagging works exceptionally well for Claude due to its training on XML-formatted contexts.
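The failure modes above can be detected on the application side before a call is dispatched. The sketch below is an assumption, not part of the reported fix: the names `check_tool_name` and `ALLOWED_TOOLS` are hypothetical. It distinguishes an exact match, a near miss that differs only in casing or separators (such as searchWeb), and a wholly invented name (such as a generic search).

```python
import re
from typing import Optional, Sequence, Tuple

# Hypothetical allowlist, using the example tool names from this report.
ALLOWED_TOOLS = ["search_files", "search_web", "search_database"]


def _normalize(name: str) -> str:
    # Drop underscores, hyphens, whitespace, and case so that
    # "searchWeb" and "search_web" compare equal.
    return re.sub(r"[_\-\s]", "", name).lower()


def check_tool_name(
    requested: str, allowed: Sequence[str]
) -> Tuple[bool, Optional[str]]:
    """Return (is_exact_match, matching_allowed_name_or_None)."""
    if requested in allowed:
        return True, requested
    # Not an exact match: look for a name that differs only in casing/separators.
    by_normalized = {_normalize(a): a for a in allowed}
    return False, by_normalized.get(_normalize(requested))


exact = check_tool_name("search_web", ALLOWED_TOOLS)
near_miss = check_tool_name("searchWeb", ALLOWED_TOOLS)
invented = check_tool_name("search", ALLOWED_TOOLS)
```

A near miss could be logged and retried with a corrective message rather than silently remapped, since silent remapping hides how often the model drifts from the exact names.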
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T07:49:12.629667+00:00— report_created — created