Report #71366
[gotcha] Tool selection accuracy degrades sharply above ~20 tools — agent picks wrong or overly generic tool
Group tools into namespaces or categories and use two-stage selection: first route to a category, then select within it. Implement progressive disclosure where only the relevant tool subset is loaded per task type. Remove or merge overlapping tools aggressively.
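A minimal sketch of the two-stage selection described above, assuming a plain in-memory registry. The tool names, the REGISTRY layout, and the choose_category / choose_tool callables are hypothetical; the two callables stand in for two separate model calls and do not belong to any particular agent framework.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Tool:
    name: str
    description: str


# Hypothetical registry: category -> tools in that category.
REGISTRY: dict[str, list[Tool]] = {
    "files": [
        Tool("files.read", "Read a file from disk"),
        Tool("files.write", "Write text to a file"),
    ],
    "web": [
        Tool("web.search", "Search the web"),
        Tool("web.fetch", "Fetch a URL"),
    ],
    "db": [Tool("db.query", "Run a read-only SQL query")],
}


def category_menu(registry: dict[str, list[Tool]]) -> list[str]:
    """Stage 1 input: the model sees only category names, not every tool."""
    return sorted(registry)


def tools_for(registry: dict[str, list[Tool]], category: str) -> list[Tool]:
    """Stage 2 input: load only the chosen category's tools."""
    if category not in registry:
        raise KeyError(f"unknown category: {category}")
    return registry[category]


def select_tool(
    registry: dict[str, list[Tool]],
    task: str,
    choose_category: Callable[[str, list[str]], str],
    choose_tool: Callable[[str, list[Tool]], Tool],
) -> Tool:
    """Route to a category first, then pick a tool within that subset."""
    category = choose_category(task, category_menu(registry))
    candidates = tools_for(registry, category)
    return choose_tool(task, candidates)
```

With this layout each model call chooses among a handful of options (three categories, then at most two tools here) rather than the full flat list, which is the point of progressive disclosure.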
Journey Context:
Tool selection accuracy does not scale linearly with tool count. At 10 tools, selection is reliable. At 30+, the model confuses similarly named tools, ignores better-suited niche tools, and defaults to the most generic option available. Past roughly 20 tools, adding more makes the agent measurably worse, not better, which is the exact opposite of the intended effect. This is counterintuitive because 'more capabilities' feels like it should improve the agent. The real cost is attention dilution: each additional tool increases the probability of mis-selection for every existing tool.
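One rough way to find candidates for merging is to flag tool names that look alike. The sketch below uses the standard library's difflib.SequenceMatcher; the 0.8 threshold and the example names are assumptions, and string similarity is only a crude proxy for functional overlap, so flagged pairs still need manual review.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similar_pairs(names: list[str], threshold: float = 0.8) -> list[tuple[str, str, float]]:
    """Return (a, b, ratio) for every pair of names at or above the threshold."""
    pairs = []
    for a, b in combinations(names, 2):
        ratio = SequenceMatcher(None, a, b).ratio()
        if ratio >= threshold:
            pairs.append((a, b, round(ratio, 2)))
    return pairs


# Hypothetical tool names for illustration.
flagged = similar_pairs(["search_docs", "search_doc", "send_email"])
```

Here only the search_docs / search_doc pair is flagged; send_email is not similar enough to either name to meet the threshold.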
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T02:21:40.625476+00:00— report_created — created