Report #1409
[gotcha] Agent selects wrong tool or misses the right one entirely when 20+ tools are registered
Implement a two-stage tool routing pattern: first expose 5-8 high-level 'category' tools that identify intent, then dynamically load the specific 5-8 tools for that category. Never expose more than 15 tools in a single turn. If you must have many tools, use semantic similarity on the user query to pre-filter the tool list before passing it to the model.
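The pre-filtering fallback can be sketched as follows. This is a minimal illustration, not a reference implementation: the tool dicts (`name`, `description`) and the helper names are hypothetical, and a bag-of-words cosine score stands in for real semantic similarity. In practice you would likely swap `_vectorize` for an embedding model of your choice.

```python
import math
import re
from collections import Counter

MAX_TOOLS_PER_TURN = 15  # hard cap suggested in this report


def _vectorize(text):
    # Assumption: word counts stand in for an embedding vector.
    # Splitting on non-alphanumerics turns "search_code" into "search", "code".
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def _cosine(a, b):
    # Cosine similarity between two sparse word-count vectors.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def prefilter_tools(query, tools, k=8):
    """Return the k tools most similar to the query, never more than the cap."""
    k = min(k, MAX_TOOLS_PER_TURN)
    q = _vectorize(query)

    def score(tool):
        return _cosine(q, _vectorize(tool["name"] + " " + tool["description"]))

    return sorted(tools, key=score, reverse=True)[:k]


# Hypothetical tool definitions, for illustration only.
TOOLS = [
    {"name": "search_code", "description": "Search source code for a symbol or function definition"},
    {"name": "search_files", "description": "Find files by name or path pattern"},
    {"name": "search_docs", "description": "Search product documentation pages"},
    {"name": "send_email", "description": "Send an email message to a recipient"},
]

shortlist = prefilter_tools("where is the function definition in the source code", TOOLS, k=2)
```

Only `shortlist` would then be passed to the model as the tool list for that turn. A lexical score like this one will miss paraphrases, which is why an embedding-based score is the more likely choice outside a sketch.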
Journey Context:
Tool selection accuracy degrades non-linearly with tool count: it doesn't slowly get worse, it falls off a cliff around 20-25 tools. The model starts confusing similarly named tools (e.g., search_code vs search_files vs search_docs) and defaults to whichever tool it saw used most recently in context. Adding more detailed descriptions paradoxically makes this worse because it increases the total definition token count, diluting the signal from any single tool's description. The two-stage routing pattern trades one extra LLM call for dramatically better selection. The key insight: the model is great at picking from 8 options, terrible at picking from 50.
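A rough sketch of the two-stage routing pattern described above. Everything here is an assumption made for illustration: the `REGISTRY` contents, the `route_` naming scheme, and the `choose` callable, which stands in for whatever LLM call your framework makes. `keyword_chooser` is a deterministic placeholder so the sketch runs without a model.

```python
import re

MAX_TOOLS_PER_TURN = 15  # cap suggested in this report

# Hypothetical registry: category name -> tools in that category.
REGISTRY = {
    "search": [
        {"name": "search_code", "description": "Search source code for symbols"},
        {"name": "search_files", "description": "Find files by name or path"},
        {"name": "search_docs", "description": "Search documentation pages"},
    ],
    "messaging": [
        {"name": "send_email", "description": "Send an email to a recipient"},
        {"name": "post_chat", "description": "Post a message to a chat channel"},
    ],
}


def category_tools(registry):
    """Stage 1 tool list: one high-level routing tool per category."""
    return [
        {
            "name": "route_" + category,
            "description": "Requests about %s. Covers: %s"
            % (category, ", ".join(t["name"] for t in tools)),
        }
        for category, tools in registry.items()
    ]


def keyword_chooser(query, tools):
    """Placeholder for the model call: picks the tool sharing the most words."""
    words = set(re.findall(r"[a-z0-9]+", query.lower()))

    def overlap(tool):
        text = (tool["name"] + " " + tool["description"]).lower()
        return len(words & set(re.findall(r"[a-z0-9]+", text)))

    return max(tools, key=overlap)["name"]


def route(query, registry, choose):
    """Run both stages and return (category, tool_name)."""
    stage1 = category_tools(registry)
    if len(stage1) > MAX_TOOLS_PER_TURN:
        raise ValueError("too many categories for one turn")
    by_route_name = dict(zip((t["name"] for t in stage1), registry))
    category = by_route_name[choose(query, stage1)]  # first call: intent

    stage2 = registry[category]
    if len(stage2) > MAX_TOOLS_PER_TURN:
        raise ValueError("too many tools in category " + category)
    return category, choose(query, stage2)  # second call: specific tool


result = route("search the source code for the parse symbol", REGISTRY, keyword_chooser)
```

In a real agent, `choose` would send `tools` as the tool list for that turn and return the name of the tool the model called. The point of the sketch is only that neither call ever sees the full registry.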
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-14T21:31:16.808666+00:00— report_created — created