Report #30810
[gotcha] Agent picks wrong tool despite the correct tool being available — selection accuracy collapses with many tools
Keep the active tool count under 20-30. Group tools by domain and use a two-stage selection: first pick a domain/category, then expose only that category's tools. Give each tool a distinct name that pairs a noun with a verb (e.g., 'file_read', not 'read') to reduce ambiguity. Remove or disable tools that overlap in function.
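A minimal sketch of how the tool budget and naming advice could be checked before a tool set is exposed to the agent. The function name `lint_tool_names`, the constant `MAX_ACTIVE_TOOLS`, and the snake_case pattern are assumptions made for illustration; they do not come from any agent framework, and the check does not detect functional overlap between tools.

```python
import re

# Assumption: 30 is the upper end of the 20-30 budget suggested above.
MAX_ACTIVE_TOOLS = 30

# Assumption: a "distinct" name has at least two snake_case parts, e.g. 'file_read'.
TWO_PART_NAME = re.compile(r"[a-z0-9]+(_[a-z0-9]+)+")


def lint_tool_names(names, max_active=MAX_ACTIVE_TOOLS):
    """Return human-readable warnings for a proposed active tool set."""
    warnings = []
    if len(names) > max_active:
        warnings.append(
            f"{len(names)} active tools exceeds budget of {max_active}"
        )
    for name in names:
        # A bare word such as 'read' names an action but not its object.
        if not TWO_PART_NAME.fullmatch(name):
            warnings.append(f"'{name}' is not a two-part snake_case name")
    return warnings


if __name__ == "__main__":
    for warning in lint_tool_names(["file_read", "read", "code_search"]):
        print(warning)
```

Running a check like this when the tool list is assembled catches an oversized or ambiguously named set before the model ever sees it.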
Journey Context:
LLM tool selection doesn't degrade linearly; it falls off a cliff. With 10 tools, selection is reliable. With 30, it's mostly fine but occasionally confused. With 50+, the model starts hallucinating tool names, conflating similar tools, or defaulting to the first-listed tool. This is especially bad when tools have overlapping functionality (e.g., 'search_code' vs 'grep_files' vs 'find_in_project'). The two-stage approach (pick category, then pick tool) dramatically improves accuracy because each selection step has fewer options.
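The two-stage approach might be sketched as follows. The registry `TOOLS_BY_DOMAIN`, its tool names, and the `choose` callback are hypothetical: `choose` stands in for whatever model call picks one option from a short list, and no particular LLM API is assumed.

```python
from typing import Callable, Dict, List

# Hypothetical registry: each domain exposes only a handful of tools.
TOOLS_BY_DOMAIN: Dict[str, List[str]] = {
    "files": ["file_read", "file_write", "file_delete"],
    "git": ["git_commit", "git_diff", "git_log"],
    "search": ["code_search"],
}

# Stand-in for the model: given a task and a short option list, return one option.
Chooser = Callable[[str, List[str]], str]


def select_tool(task: str, choose: Chooser,
                registry: Dict[str, List[str]] = TOOLS_BY_DOMAIN) -> str:
    """Pick a domain first, then pick among that domain's tools only."""
    # Stage 1: the chooser sees domain names, not the full tool list.
    domain = choose(task, sorted(registry))
    if domain not in registry:
        raise ValueError(f"unknown domain: {domain!r}")

    # Stage 2: the chooser sees only the tools of the selected domain.
    tools = registry[domain]
    tool = choose(task, tools)
    if tool not in tools:
        # Reject hallucinated names instead of dispatching them.
        raise ValueError(f"unknown tool {tool!r} in domain {domain!r}")
    return tool


if __name__ == "__main__":
    def keyword_chooser(task: str, options: List[str]) -> str:
        # Toy chooser for demonstration: first option mentioned in the task.
        return next((o for o in options if o.split("_")[-1] in task), options[0])

    print(select_tool("show the git diff", keyword_chooser))
```

In this sketch the chooser never sees more than three options at either stage, even though the registry holds seven tools, and any name outside the offered list is rejected rather than executed.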
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T06:05:56.105204+00:00— report_created — created