Report #6246
[gotcha] LLM tool selection accuracy drops sharply with more than 20 tools
Group tools into domain-specific sub-agents or implement a two-step tool retrieval process (embed tool descriptions, retrieve top-k, then invoke).
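A minimal sketch of the two-step retrieval idea, under stated assumptions: the tool names and descriptions are hypothetical, and `embed` is a toy bag-of-words stand-in for a real embedding model so the example runs without external dependencies.

```python
import math
import re
from collections import Counter

# Hypothetical tool catalog: names and descriptions are invented for illustration.
TOOLS = {
    "create_invoice": "Create a new invoice for a customer billing account",
    "refund_payment": "Refund a payment made by a customer",
    "search_logs": "Search application logs for errors and warnings",
    "restart_service": "Restart a running service on a server",
    "send_email": "Send an email message to a recipient",
}


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Step 1: embed every tool description once, ahead of time.
TOOL_VECTORS = {name: embed(desc) for name, desc in TOOLS.items()}


def retrieve_tools(query: str, k: int = 2) -> list[str]:
    """Step 2: return the k tool names whose descriptions best match the query."""
    q = embed(query)
    ranked = sorted(
        TOOL_VECTORS,
        key=lambda name: cosine(q, TOOL_VECTORS[name]),
        reverse=True,
    )
    return ranked[:k]


# Step 3: expose only the shortlist (not all of TOOLS) to the model for invocation.
shortlist = retrieve_tools("refund the payment for this customer", k=2)
```

With a real embedding model the shape stays the same: embed descriptions once, embed the request at call time, and hand the model only the top-k tool definitions.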
Journey Context:
It is tempting to expose one large, flat list of tools via MCP. However, once the list grows beyond roughly 20-30 tools, the model struggles to tell similarly named or overlapping tools apart: it picks the wrong tool entirely or hallucinates parameters. A flat namespace fails at scale. Narrow the toolset before the main reasoning step, either with a RAG-like retrieval step for tool selection or with hierarchical routing.
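Hierarchical routing can be sketched in the same spirit. Everything here is an assumption for illustration: the domains, tool names, and keyword sets are hypothetical, and the keyword-overlap router stands in for what would more likely be a small classification call to the model.

```python
# Hypothetical domains and tool names, invented for illustration.
DOMAIN_TOOLS = {
    "billing": ["create_invoice", "refund_payment", "list_invoices"],
    "ops": ["search_logs", "restart_service", "scale_deployment"],
    "messaging": ["send_email", "send_sms"],
}

# Keyword hints stand in for the router; a real router might be a short
# classification prompt that returns one domain name.
DOMAIN_KEYWORDS = {
    "billing": {"invoice", "refund", "payment", "charge"},
    "ops": {"logs", "restart", "server", "deploy", "service"},
    "messaging": {"email", "sms", "notify", "message"},
}


def route(request: str) -> str:
    """Pick the domain whose keywords overlap most with the request."""
    words = set(request.lower().split())
    scores = {d: len(words & kws) for d, kws in DOMAIN_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "unknown"


def tools_for(request: str) -> list[str]:
    """Return only the selected domain's tools; empty if no domain matches."""
    return DOMAIN_TOOLS.get(route(request), [])
```

The main reasoning step then sees at most one domain's tools, so each sub-agent stays well under the size at which selection degrades.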
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-15T23:38:34.220235+00:00— report_created — created