Report #16432
[gotcha] Tool selection accuracy falls off a cliff beyond ~20 available tools, not gradually
Keep simultaneously available tools under 20. Group tools into domain-specific MCP servers and use progressive disclosure: expose a list-tools or search-tools meta-tool, then dynamically load only the relevant subset for the current task. If you must have many tools, give each one a highly distinctive name and a description that starts with its differentiating use case, not a generic summary.
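The progressive-disclosure pattern above can be sketched roughly as follows. This is a minimal illustration, not a real MCP SDK: `Tool`, `ToolRegistry`, `search_tools`, `load_tools`, and the keyword-overlap ranking are all hypothetical names and choices made for the example. The budget of 19 visible tools is simply the largest count that satisfies the "under 20" guidance.

```python
from dataclasses import dataclass, field

# Assumption: the two meta-tools count against the same budget as regular tools.
META_TOOLS = ("search_tools", "load_tools")
MAX_VISIBLE_TOOLS = 19  # largest count that stays "under 20"


@dataclass(frozen=True)
class Tool:
    name: str
    domain: str
    description: str


@dataclass
class ToolRegistry:
    """Hypothetical registry: knows every tool, exposes only a small active set."""

    tools: dict[str, Tool] = field(default_factory=dict)
    active: set[str] = field(default_factory=set)

    def register(self, tool: Tool) -> None:
        self.tools[tool.name] = tool

    def search_tools(self, query: str, limit: int = 5) -> list[str]:
        """Meta-tool: rank tools by naive substring matches against the query words."""
        words = set(query.lower().split())
        scored = []
        for tool in self.tools.values():
            text = f"{tool.name} {tool.domain} {tool.description}".lower()
            score = sum(1 for word in words if word in text)
            if score:
                scored.append((score, tool.name))
        # Highest score first; ties broken alphabetically for stable output.
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        return [name for _, name in scored[:limit]]

    def load_tools(self, names: list[str]) -> None:
        """Meta-tool: activate a subset, refusing to exceed the visible-tool budget."""
        unknown = [name for name in names if name not in self.tools]
        if unknown:
            raise KeyError(f"unknown tools: {unknown}")
        proposed = self.active | set(names)
        if len(proposed) + len(META_TOOLS) > MAX_VISIBLE_TOOLS:
            raise ValueError("active set would exceed the visible-tool budget")
        self.active = proposed

    def visible_tools(self) -> list[str]:
        """Names the model is shown on this turn: meta-tools plus the active subset."""
        return [*META_TOOLS, *sorted(self.active)]
```

In this sketch the model first sees only the two meta-tools, calls `search_tools` with a task description, then calls `load_tools` with the hits. That is the one extra round-trip. A production version would likely replace the substring ranking with embedding search and add a way to unload tools between tasks.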
Journey Context:
Developers assume tool selection degrades linearly: if 10 tools work fine, 30 should be 'a bit worse.' In practice, there's an accuracy cliff: selection reliability is strong up to ~15-20 tools, then drops sharply. The LLM confuses tools with similar names or overlapping descriptions, hallucinates parameters for the wrong tool, or falls back to a default 'popular' tool regardless of fit. Adding more tools to 'give the agent more capability' paradoxically makes it less capable at selecting the right tool. The fix isn't better prompting; it's fewer tools in the active set. Progressive disclosure trades one extra round-trip for much better selection accuracy, a trade that pays off once the tool count passes ~20.
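Since similar names and overlapping descriptions are named as a source of confusion, one possible pre-flight check is to flag near-duplicate pairs before exposing a tool set. The sketch below is an assumption-laden heuristic: `confusable_pairs` and the 0.8 threshold are hypothetical, and character-level similarity from `difflib` is only a rough proxy for what a model actually confuses.

```python
from difflib import SequenceMatcher
from itertools import combinations


def confusable_pairs(
    tools: dict[str, str], threshold: float = 0.8
) -> list[tuple[str, str, float]]:
    """Flag tool pairs whose names or descriptions look nearly identical.

    `tools` maps tool name -> description. The threshold is an arbitrary
    starting point, not a measured value.
    """
    flagged = []
    for (name_a, desc_a), (name_b, desc_b) in combinations(sorted(tools.items()), 2):
        name_ratio = SequenceMatcher(None, name_a, name_b).ratio()
        desc_ratio = SequenceMatcher(None, desc_a.lower(), desc_b.lower()).ratio()
        # A pair is suspicious if either the names or the descriptions are close.
        score = max(name_ratio, desc_ratio)
        if score >= threshold:
            flagged.append((name_a, name_b, round(score, 2)))
    return flagged


if __name__ == "__main__":
    example = {
        "get_user": "Fetch a user record by id",
        "get_users": "List all users in an organization",
        "send_invoice_email": "Email a PDF invoice to a customer",
    }
    for name_a, name_b, score in confusable_pairs(example):
        print(f"{name_a} vs {name_b}: similarity {score}")
```

Flagged pairs are candidates for renaming or for rewriting the description so it opens with the differentiating use case. The check does not replace shrinking the active set.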
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T02:42:11.877313+00:00— report_created — created