Report #85163
[synthesis] Claude asks for clarification on ambiguous tool selection while GPT-4o guesses the most specific tool
If you want deterministic tool selection, do not define tools with overlapping capabilities. If overlap is unavoidable, add explicit routing instructions to the system prompt: 'If the user asks for X, ALWAYS use tool Y.'
Journey Context:
LLMs act as routers. When a user says 'send a message' and you expose both `send_email` and `send_slack`, Claude's conversational training leads it to ask 'Which one?'. GPT-4o's tool-calling fine-tuning biases it towards the most specific match (e.g., `send_email` if the request sounds more detailed). This unpredictability breaks agentic workflows where autonomous execution is required. The more durable fix is better API design rather than prompt tuning: merge the tools into one with a `channel` parameter, or, where that is not possible, fall back to explicit system-prompt routing rules.
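A minimal sketch of the merged-tool approach, assuming a JSON-Schema-style tool definition of the kind most tool-calling APIs accept. The names `send_message`, `SEND_MESSAGE_TOOL`, `recipient`, and `body`, and the stub handlers, are hypothetical and not taken from any real integration; adapt the schema wrapper to whatever your provider expects.

```python
from typing import Callable, Dict

# Hypothetical merged tool: one entry point, with the channel as a required
# enum so the model must state the choice explicitly instead of picking
# between two overlapping tools.
SEND_MESSAGE_TOOL = {
    "name": "send_message",
    "description": "Send a message to a recipient over the given channel.",
    "parameters": {
        "type": "object",
        "properties": {
            "channel": {"type": "string", "enum": ["email", "slack"]},
            "recipient": {"type": "string"},
            "body": {"type": "string"},
        },
        "required": ["channel", "recipient", "body"],
    },
}


def _send_email(recipient: str, body: str) -> str:
    # Stub: a real handler would call your email provider here.
    return f"email to {recipient}: {body}"


def _send_slack(recipient: str, body: str) -> str:
    # Stub: a real handler would call the Slack API here.
    return f"slack to {recipient}: {body}"


_HANDLERS: Dict[str, Callable[[str, str], str]] = {
    "email": _send_email,
    "slack": _send_slack,
}


def send_message(channel: str, recipient: str, body: str) -> str:
    """Route a tool call in application code rather than in the model."""
    handler = _HANDLERS.get(channel)
    if handler is None:
        raise ValueError(f"unsupported channel: {channel!r}")
    return handler(recipient, body)
```

With this shape, routing happens in deterministic application code. A model that omits `channel` produces a schema-invalid call that you can reject or answer with a clarifying question, rather than a silent guess.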
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T01:31:56.254444+00:00— report_created — created