Report #20946
[gotcha] Agent picks wrong tool or hallucinates tool names with 30+ tools available
Keep the active tool set under 20 per request. Use semantic grouping: expose 'router' tools that take an intent and return the right sub-tool name, or dynamically filter the tool list based on conversation context before presenting it to the LLM. Prefer fewer, more capable tools over many narrow ones.
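A minimal sketch of the dynamic-filtering option, assuming a hypothetical in-memory registry where each tool carries a set of hand-assigned keyword tags. The `Tool` class, the tool names, and the keyword-overlap scoring are illustrative assumptions, not part of any specific agent framework; a real system might score with embeddings instead.

```python
from __future__ import annotations

import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Tool:
    """Hypothetical tool record; the tags are hand-assigned keywords."""
    name: str
    description: str
    tags: frozenset[str]


# Illustrative registry. In practice this could hold 100+ entries.
TOOLS = [
    Tool("create_invoice", "Create an invoice", frozenset({"invoice", "billing", "charge"})),
    Tool("refund_payment", "Refund a payment", frozenset({"refund", "payment", "billing"})),
    Tool("search_flights", "Search for flights", frozenset({"flight", "travel", "book"})),
    Tool("book_hotel", "Book a hotel room", frozenset({"hotel", "travel", "book"})),
    Tool("send_email", "Send an email", frozenset({"email", "send", "message"})),
]


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def select_tools(tools: list[Tool], context: str, max_tools: int = 8) -> list[Tool]:
    """Return at most max_tools tools whose tags overlap the context.

    Tools with no overlap are dropped entirely, so the LLM never sees them.
    Ties are broken by name to keep the result deterministic.
    """
    words = tokenize(context)
    scored = []
    for tool in tools:
        score = len(words & tool.tags)
        if score > 0:
            scored.append((score, tool))
    scored.sort(key=lambda pair: (-pair[0], pair[1].name))
    return [tool for _, tool in scored[:max_tools]]


context = "I need a refund for the billing error on my last invoice"
active = select_tools(TOOLS, context)
print([tool.name for tool in active])
```

Only the tools in `active` would be passed to the LLM for this request. The `max_tools` cap is where the "under 20 per request" limit would be enforced.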
Journey Context:
LLM tool selection doesn't degrade linearly; there's a cliff. With 5-10 tools, selection is reliable. At 20-30, errors increase noticeably. Beyond 40-50, the model frequently picks the wrong tool, ignores tools entirely, or hallucinates tool names that don't exist. This is an attention problem: the model's attention is spread across all tool descriptions, making it hard to distinguish similar ones. Adding more detailed descriptions to disambiguate makes it worse (more tokens to attend to, more surface for confusion). Router/meta-tool patterns let you have 100+ tools logically while only presenting 5-10 to the LLM at a time.
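One way the router/meta-tool pattern could look, as a sketch under stated assumptions: the group names, tool names, and the `route_intent` function are hypothetical, and the schema dict is a generic JSON-Schema-style shape rather than any particular vendor's tool format. This variant returns a whole group of sub-tool names for an intent rather than a single name.

```python
from __future__ import annotations

# Hypothetical grouping of a large tool catalogue by intent.
TOOL_GROUPS: dict[str, list[str]] = {
    "billing": ["create_invoice", "refund_payment", "list_charges"],
    "travel": ["search_flights", "book_hotel"],
    "messaging": ["send_email", "send_sms"],
}


def route_intent(intent: str) -> list[str]:
    """Router tool: map an intent to the sub-tool names in its group."""
    if intent not in TOOL_GROUPS:
        # Reject unknown intents instead of guessing a group.
        raise ValueError(f"unknown intent: {intent!r}")
    return list(TOOL_GROUPS[intent])


def router_schema() -> dict:
    """Describe the router so the model can only pick a known intent."""
    return {
        "name": "route_intent",
        "description": "Pick the intent that matches the user's request.",
        "parameters": {
            "type": "object",
            "properties": {
                "intent": {"type": "string", "enum": sorted(TOOL_GROUPS)},
            },
            "required": ["intent"],
        },
    }


def active_tools(intent: str | None) -> list[str]:
    """Tool names to expose on the next request.

    Before routing, only the router is visible. After routing, the router
    stays available (so the model can switch groups) alongside one group.
    """
    if intent is None:
        return ["route_intent"]
    return ["route_intent"] + route_intent(intent)


print(active_tools(None))
print(active_tools("travel"))
```

Constraining `intent` to an enum of known group names leaves less room for a hallucinated name at the routing step, and the model only ever sees the router plus one small group at a time.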
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T13:33:39.377817+00:00— report_created — created