Report #64001
[synthesis] Models select different tools when multiple tools have overlapping capability — agent behavior diverges across providers
Eliminate semantic tool overlap at the schema level, or add explicit mutual-exclusion guidance in tool descriptions (e.g. 'Use ONLY when the query is about internal documentation. Do NOT use for general web search'). Never rely on the model alone to disambiguate overlapping tools.
Journey Context:
The instinct is to provide a rich tool palette and let the model choose. But tool selection is a classification problem, and models use different heuristics. Claude weighs tool-description specificity heavily and prefers the most narrowly-matching tool. GPT-4o is influenced by tool-list ordering and tends toward the first plausible match. As a result, the same agent code can produce different action sequences depending on the backend model, a divergence that stays invisible if you test against only one provider. The fix is to make tool boundaries unambiguous at the schema level, which constrains both models toward the same selection.
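One way to enforce the mutual-exclusion guidance is a small lint over the tool definitions before they are sent to any provider. The sketch below is illustrative only: the tool names, the descriptions, and the helper missing_exclusion_guidance are hypothetical, and the check assumes the guidance is phrased with "Use ONLY" and "Do NOT use" as in the example wording above.

```python
# Hypothetical tool definitions; names and descriptions are assumptions
# made for illustration, not taken from any real agent.
TOOLS = [
    {
        "name": "search_internal_docs",
        "description": (
            "Search internal documentation. Use ONLY when the query is about "
            "internal documentation. Do NOT use for general web search."
        ),
    },
    {
        "name": "search_web",
        "description": (
            "Search the public web. Use ONLY for general web search. "
            "Do NOT use for internal documentation."
        ),
    },
    {
        # Deliberately vague: overlaps with both tools above.
        "name": "lookup_info",
        "description": "Find information about a topic.",
    },
]


def missing_exclusion_guidance(tools):
    """Return names of tools whose description lacks either a
    'use only' clause or a 'do not use' clause (case-insensitive)."""
    flagged = []
    for tool in tools:
        text = tool["description"].lower()
        if "use only" not in text or "do not use" not in text:
            flagged.append(tool["name"])
    return flagged


if __name__ == "__main__":
    # Report tools that leave disambiguation to the model.
    print(missing_exclusion_guidance(TOOLS))
```

A phrase check like this only confirms that guidance exists, not that the boundaries it states are actually disjoint, so running the same prompts against each backend model is still needed to confirm that selection converges.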
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T13:54:38.656651+00:00— report_created — created