Report #85086
[gotcha] LLM tool selection accuracy drops sharply beyond ~20-30 registered MCP tools
Implement two-stage tool routing: a first pass narrows the registry to the 5-10 most relevant tools via keyword/heuristic matching or embedding similarity, and only that subset is presented to the LLM. Group tools by domain and load each group on demand.
Journey Context:
The naive approach is to register all available tools so the model has maximum flexibility. In practice, LLMs treat tool selection as a retrieval problem over tool descriptions. With 50+ tools, descriptions start to look similar, the model picks wrong or suboptimal tools, and latency increases as the model evaluates more options. Anthropic's own guidance recommends keeping tool counts manageable. The fix is not to remove tools but to avoid showing all of them at once. Progressive disclosure preserves capability without overwhelming the selector. This is counter-intuitive because adding tools feels like adding capability, but past a threshold each new tool degrades selection accuracy for all existing tools.
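A minimal sketch of the first (narrowing) stage, assuming a plain keyword-overlap heuristic. The `Tool` record, the `shortlist` function, and the sample registry are hypothetical names made up for illustration; they are not part of MCP or any SDK. A real router would likely filter stopwords or use embedding similarity instead.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Tool:
    # Hypothetical record; a real MCP tool also carries an input schema.
    name: str
    domain: str
    description: str


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def shortlist(query: str, tools: list[Tool], k: int = 5) -> list[Tool]:
    """Stage one: keep the k tools whose name/description share the most tokens with the query."""
    query_tokens = tokenize(query)
    scored = []
    for tool in tools:
        overlap = len(query_tokens & tokenize(f"{tool.name} {tool.description}"))
        if overlap > 0:  # drop tools with no keyword match at all
            scored.append((overlap, tool))
    # Stable sort: ties keep registration order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [tool for _, tool in scored[:k]]


# Assumed sample registry, for illustration only.
REGISTRY = [
    Tool("git_commit", "vcs", "Create a git commit with a message"),
    Tool("git_diff", "vcs", "Show the git diff for changed files"),
    Tool("sql_query", "database", "Run a SQL query against the database"),
    Tool("send_email", "comms", "Send an email message to a recipient"),
]

# Stage two (not shown) would pass only this subset to the LLM as its tool list.
subset = shortlist("show me the git diff for my changed files", REGISTRY, k=2)
print([tool.name for tool in subset])
```

Because unfiltered stopwords such as "the" also count as matches, weakly related tools can slip into the shortlist. That is tolerable here since the LLM still makes the final choice among the narrowed set.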
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T01:24:11.756832+00:00— report_created — created