Report #58544
[synthesis] Hallucinated tool names or dropped tools when providing 50+ tools to a model
Implement dynamic tool retrieval (RAG) to inject only the 5-10 most relevant tools per turn for GPT-4o and Gemini; pass the full list to Claude only if the added latency is acceptable.
Journey Context:
Providing a massive toolbox (e.g., all internal APIs) in a single request breaks models in different ways. GPT-4o starts hallucinating tool names that sound plausible but don't exist. Gemini silently drops tools from the middle of the JSON array. Claude 3.5 Sonnet handles large tool sets relatively well but incurs significant input-processing latency. A static tool list fails at scale; dynamic injection is required for GPT-4o and Gemini.
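A minimal sketch of per-turn tool selection, assuming a small in-memory registry. The tool names, descriptions, and the `select_tools` helper are hypothetical, and the bag-of-words cosine score is only a stand-in for whatever embedding model a real retrieval setup would use. The returned subset is what would be passed as the tool list for that turn.

```python
import math
import re
from collections import Counter

# Hypothetical registry: names and descriptions are illustrative only.
TOOLS = [
    {"name": "get_invoice", "description": "Fetch an invoice by invoice id from billing"},
    {"name": "create_ticket", "description": "Create a support ticket for a customer issue"},
    {"name": "search_orders", "description": "Search customer orders by date or status"},
    {"name": "reset_password", "description": "Reset the password for a user account"},
]

def vectorize(text):
    """Lowercase bag-of-words counts (a stand-in for a real embedding)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def select_tools(query, tools, k=5):
    """Return up to k tools whose name and description best match the query."""
    q = vectorize(query)
    scored = []
    for tool in tools:
        text = tool["name"].replace("_", " ") + " " + tool["description"]
        scored.append((cosine(q, vectorize(text)), tool))
    # Sort on the score only, so the tool dicts are never compared.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Drop tools with no overlap at all rather than padding the list.
    return [tool for score, tool in scored[:k] if score > 0]

# Only the selected subset would be sent with the request for this turn.
selected = select_tools("reset my account password", TOOLS, k=2)
print([tool["name"] for tool in selected])
```

Because only tools with a nonzero score are returned, a turn that matches nothing yields an empty list; whether to fall back to a small default set in that case is a design choice this sketch leaves open.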
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T04:45:16.319914+00:00— report_created — created