Report #87497
[gotcha] LLM tool selection accuracy falls off a cliff beyond 20-30 active tools
Cap actively exposed tools at 20-30 per LLM call. Implement two-stage selection: use lightweight heuristics or a classifier to pick a relevant tool subset, then present only those schemas to the LLM. Namespace tools by MCP server and activate one server's tools at a time when possible.
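The two-stage selection described above could be sketched as follows. This is a minimal illustration rather than a recommended implementation: the `Tool` class, the `select_tools` function, the keyword-overlap scoring, and the `MAX_ACTIVE_TOOLS` value are all hypothetical names and choices made for this example. A real system might use embeddings or a trained classifier for stage one.

```python
import re
from dataclasses import dataclass

# Assumption: use the lower end of the 20-30 cap suggested in this report.
MAX_ACTIVE_TOOLS = 20


@dataclass(frozen=True)
class Tool:
    name: str          # hypothetical namespaced name, e.g. "github.create_issue"
    description: str


def _tokens(text: str) -> set[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def select_tools(task: str, tools: list[Tool], limit: int = MAX_ACTIVE_TOOLS) -> list[Tool]:
    """Stage one: rank tools by keyword overlap with the task, keep the top `limit`.

    Stage two (not shown) would pass only the returned tools' schemas to the LLM.
    """
    task_tokens = _tokens(task)
    scored = []
    for tool in tools:
        overlap = len(task_tokens & _tokens(tool.name + " " + tool.description))
        if overlap > 0:
            scored.append((overlap, tool))
    # Highest overlap first; ties broken by name so the result is deterministic.
    scored.sort(key=lambda pair: (-pair[0], pair[1].name))
    return [tool for _, tool in scored[:limit]]
```

Keyword overlap is crude (it counts stopwords such as "a", for instance), but it is cheap enough to run on every request, and the cap holds regardless of how many tools are registered.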
Journey Context:
LLMs select tools by matching task intent against tool names and descriptions. With 50+ tools, several will have overlapping descriptions, and selection accuracy degrades sharply: the model calls the wrong tools, hallucinates parameters, or ignores the right tool. The degradation is not linear: going from 10 to 20 tools costs little accuracy, but going from 30 to 50 causes a cliff. Writing better descriptions helps marginally but doesn't solve the underlying attention-dilution problem. The architectural answer is progressive disclosure: the LLM should never see all tools at once.
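One way to approximate progressive disclosure is to namespace tool schemas by server and expose only the active server's tools on each call. The sketch below assumes tool schemas are plain dictionaries with a `name` key. The `ToolRegistry` class and its methods are hypothetical and not part of any MCP SDK.

```python
from collections import defaultdict
from typing import Optional


class ToolRegistry:
    """Groups tool schemas by server name and exposes one group at a time."""

    def __init__(self) -> None:
        self._by_server = defaultdict(list)
        self._active: Optional[str] = None

    def register(self, server: str, schema: dict) -> None:
        # Prefix the tool name with its server so names never collide.
        namespaced = dict(schema, name=f"{server}.{schema['name']}")
        self._by_server[server].append(namespaced)

    def activate(self, server: str) -> None:
        if server not in self._by_server:
            raise KeyError(f"unknown server: {server}")
        self._active = server

    def servers(self) -> list:
        """Server names only: a short list the LLM could choose from first."""
        return sorted(self._by_server)

    def visible_schemas(self) -> list:
        """Schemas to send with the next LLM call: the active server's only."""
        if self._active is None:
            return []
        return list(self._by_server[self._active])
```

A caller might first show the model only the output of `servers()`, call `activate()` on its choice, and then send `visible_schemas()` with the follow-up request, so the full tool catalog never appears in a single prompt.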
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T05:26:59.728334+00:00 - report_created - created