Report #38874
[synthesis] Model selects the wrong tool when multiple tools have overlapping capabilities
Order tools deliberately in the schema array and give them distinct names and descriptions. GPT-4o exhibits a positional bias: when a request is ambiguous, it heavily favors tools defined earlier in the tools array. Claude 3.5 Sonnet exhibits a semantic bias: it favors the tool whose description best matches the user's intent, regardless of position. For GPT-4o, put the most common or safest tool first; for Claude, focus on precise descriptions.
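A minimal sketch of the ordering advice, assuming tools are kept in one neutral list and converted per provider. The tool names, descriptions, PRIORITY list, and helper functions are hypothetical; the two output shapes follow the commonly documented function-calling layouts (a "function" wrapper with "parameters" for OpenAI, "input_schema" for Anthropic), so check them against the current API references before relying on them.

```python
from typing import Any

# Hypothetical neutral tool specs; names and descriptions are illustrative only.
TOOLS: list[dict[str, Any]] = [
    {
        "name": "search_web",
        "description": "Query the public internet for documentation and articles.",
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
    {
        "name": "search_code",
        "description": "Find symbols and implementations in source files of the current repository.",
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
]

# Assumption: search_code is the most common/safe tool for this agent.
PRIORITY = ["search_code", "search_web"]


def order_tools(tools: list[dict[str, Any]], priority: list[str]) -> list[dict[str, Any]]:
    """Return tools sorted by priority; unlisted tools keep their relative order at the end."""
    rank = {name: i for i, name in enumerate(priority)}
    return sorted(tools, key=lambda t: rank.get(t["name"], len(rank)))


def to_openai_tool(tool: dict[str, Any]) -> dict[str, Any]:
    """Wrap a neutral spec in the OpenAI-style function tool layout."""
    return {
        "type": "function",
        "function": {
            "name": tool["name"],
            "description": tool["description"],
            "parameters": tool["parameters"],
        },
    }


def to_anthropic_tool(tool: dict[str, Any]) -> dict[str, Any]:
    """Convert a neutral spec to the Anthropic-style tool layout."""
    return {
        "name": tool["name"],
        "description": tool["description"],
        "input_schema": tool["parameters"],
    }


ordered = order_tools(TOOLS, PRIORITY)
openai_tools = [to_openai_tool(t) for t in ordered]
anthropic_tools = [to_anthropic_tool(t) for t in ordered]
```

Keeping one ordered source list means both providers receive the same order and the same descriptions, so a fix for one model is not silently undone for the other.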
Journey Context:
When an agent has search_code and search_web, and a user asks 'find the implementation of X', models diverge. GPT-4o's attention mechanism disproportionately weights earlier definitions (primacy bias). Claude relies more heavily on semantic similarity between the prompt and the tool description. Relying solely on descriptions fails for GPT-4o if the wrong tool is first; relying solely on order fails for Claude if descriptions are vague. You must optimize for both: correct order AND distinct descriptions.
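One rough way to catch vague or overlapping descriptions before they reach the model is a lexical overlap check. The sketch below uses Jaccard similarity over word tokens; the threshold, the stopword list, and the example descriptions are all assumptions, and this heuristic says nothing about how either model actually scores tools.

```python
import re
from itertools import combinations

# Assumed minimal stopword list; extend as needed.
STOPWORDS = {"the", "a", "an", "for", "of", "in", "to", "and", "or"}


def tokens(text: str) -> set[str]:
    """Lowercase word tokens with stopwords removed."""
    return {w for w in re.findall(r"[a-z0-9_]+", text.lower()) if w not in STOPWORDS}


def jaccard(a: str, b: str) -> float:
    """Jaccard similarity of the token sets of two descriptions."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def overlapping_pairs(descriptions: dict[str, str], threshold: float = 0.5):
    """Return (tool_a, tool_b, score) for every pair at or above the threshold."""
    flagged = []
    for (n1, d1), (n2, d2) in combinations(descriptions.items(), 2):
        score = jaccard(d1, d2)
        if score >= threshold:
            flagged.append((n1, n2, score))
    return flagged


# Hypothetical descriptions: the vague pair differs by a single word.
vague = {
    "search_code": "Search for relevant results.",
    "search_web": "Search for relevant information.",
}
distinct = {
    "search_code": "Find symbols and implementations in source files of the current repository.",
    "search_web": "Query the public internet for documentation and articles.",
}

print(overlapping_pairs(vague))     # flags the search_code / search_web pair
print(overlapping_pairs(distinct))  # no pairs flagged
```

A flagged pair is only a prompt to rewrite the descriptions; an unflagged pair can still be ambiguous to a model, so test with real prompts such as 'find the implementation of X'.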
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T19:43:26.679443+00:00— report_created — created