Report #94059
[gotcha] LLM tool selection accuracy degrades severely past 20-30 available tools
Keep simultaneously exposed tools under 20 when possible. Use highly differentiated tool names and descriptions — include explicit 'Use this when...' phrasing. If you must support many tools, implement dynamic tool filtering that selects a relevant subset based on the current conversation intent before presenting them to the LLM.
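The dynamic filtering idea above can be sketched as follows. This is a minimal illustration, not a reference implementation: `Tool`, `select_tools`, `STOPWORDS`, `REGISTRY`, and the three example tools are all hypothetical names, and plain keyword overlap stands in for whatever intent detection a real system would use (embeddings, a classifier, or a routing model).

```python
import re
from dataclasses import dataclass

MAX_EXPOSED_TOOLS = 20  # cap suggested by this report; an assumption, not a hard limit

# Small illustrative stopword list so filler words do not count as matches.
STOPWORDS = {"a", "an", "the", "this", "when", "use", "user", "is", "of",
             "for", "in", "about", "what", "to", "asks", "wants"}


@dataclass(frozen=True)
class Tool:
    name: str
    description: str


def _tokens(text: str) -> set[str]:
    """Lowercase alphanumeric words in `text`, minus stopwords."""
    return set(re.findall(r"[a-z0-9]+", text.lower())) - STOPWORDS


def select_tools(message: str, tools: list[Tool],
                 limit: int = MAX_EXPOSED_TOOLS) -> list[Tool]:
    """Return at most `limit` tools that share keywords with the message."""
    query = _tokens(message)
    scored = []
    for tool in tools:
        overlap = len(query & _tokens(tool.name + " " + tool.description))
        if overlap > 0:
            scored.append((overlap, tool))
    # Highest overlap first; the sort is stable, so ties keep registry order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [tool for _, tool in scored[:limit]]


# Hypothetical registry; each description uses 'Use this when...' phrasing.
REGISTRY = [
    Tool("search_orders", "Use this when the user asks about the status of an existing order."),
    Tool("create_refund", "Use this when the user wants money back for a purchase."),
    Tool("get_weather", "Use this when the user asks about the weather forecast for a city."),
]

selected = select_tools("What is the weather forecast in Oslo?", REGISTRY)
print([tool.name for tool in selected])  # prints ['get_weather']
```

Only the `selected` subset would then be passed to the LLM request. If nothing matches, the function returns an empty list, so a real system would likely want a small default set of general-purpose tools as a fallback.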
Journey Context:
Tool selection is an attention problem. As the tool list grows, the LLM's attention is spread thinner across tool descriptions, and similarly named or similarly described tools become indistinguishable. The degradation is not linear; it falls off a cliff. At 10 tools, selection is reliable. At 30, the LLM frequently picks the wrong tool or hallucinates parameters. At 50+, it may ignore tools entirely and attempt the task without tool use. Adding more detailed descriptions seems like it should help, but it often makes things worse by increasing token count and description overlap. The counterintuitive fix is fewer, more precisely described tools, not more documentation per tool.
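One rough way to spot the description overlap mentioned above is to compare tool descriptions pairwise before exposing them. The sketch below uses Jaccard similarity over words. The helper names, the example tools, and the 0.6 threshold are assumptions chosen for illustration; word overlap is only a crude proxy for what an LLM actually confuses.

```python
import re
from itertools import combinations


def _words(text: str) -> set[str]:
    """Lowercase alphanumeric words in `text`."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: str, b: str) -> float:
    """Shared words divided by total distinct words across both texts."""
    words_a, words_b = _words(a), _words(b)
    union = words_a | words_b
    return len(words_a & words_b) / len(union) if union else 0.0


def overlapping_pairs(descriptions: dict[str, str],
                      threshold: float = 0.6) -> list[tuple[str, str, float]]:
    """List tool pairs whose descriptions are at least `threshold` similar."""
    flagged = []
    for (name_a, desc_a), (name_b, desc_b) in combinations(descriptions.items(), 2):
        score = jaccard(desc_a, desc_b)
        if score >= threshold:
            flagged.append((name_a, name_b, round(score, 2)))
    return flagged


# Hypothetical tool descriptions; the first two differ by a single word.
DESCRIPTIONS = {
    "get_user": "Fetch a user record by id",
    "fetch_user": "Fetch a user record by email",
    "send_email": "Send an email message to a recipient",
}

flagged = overlapping_pairs(DESCRIPTIONS)
print(flagged)  # prints [('get_user', 'fetch_user', 0.71)]
```

Flagged pairs are candidates for merging into one tool or for rewriting with sharper 'Use this when...' wording.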
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T16:27:52.030061+00:00 — report_created — created