Report #26938
[synthesis] Tool definitions consume unexpected token budget, degrading agent performance with large tool sets
For agents with more than 15 tools, implement dynamic tool selection per turn: analyze the user request, select the 5-10 most relevant tools, and include only those in the API call. OpenAI counts all tool definitions against input tokens (often 2000-5000 tokens for 20+ tools). Claude likewise counts tool schemas as input tokens. Gemini has a hard limit on function declarations per request. Rotate tools in and out each turn as the conversation requires.
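A minimal sketch of the per-turn selection step, assuming each tool is a plain dict with `name` and `description` keys. The `select_tools` helper, the stopword list, and the sample tools are hypothetical and not part of any provider SDK. It scores tools by word overlap with the request and keeps the top matches:

```python
import re

# Small, illustrative stopword list (assumption); extend it for real traffic.
STOPWORDS = {"the", "for", "and", "with", "what", "that", "this", "from"}

def _words(text: str) -> set[str]:
    """Lowercase alphanumeric words of length >= 3, minus stopwords."""
    found = re.findall(r"[a-z0-9]+", text.lower())
    return {w for w in found if len(w) >= 3 and w not in STOPWORDS}

def select_tools(request: str, tools: list[dict], limit: int = 10) -> list[dict]:
    """Return up to `limit` tools whose name or description shares words with the request."""
    query = _words(request)
    scored = []
    for index, tool in enumerate(tools):
        vocab = _words(tool["name"] + " " + tool["description"])
        score = len(query & vocab)
        if score > 0:
            scored.append((score, index, tool))
    # Highest overlap first; original order breaks ties.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [tool for _, _, tool in scored[:limit]]

# Hypothetical tool registry used only for illustration.
TOOLS = [
    {"name": "get_weather", "description": "Look up the current weather forecast for a city"},
    {"name": "send_email", "description": "Send an email message to a recipient"},
    {"name": "search_flights", "description": "Search available flights between two airports"},
    {"name": "create_invoice", "description": "Create a billing invoice for a customer"},
]

selected = select_tools("What is the weather forecast in Paris?", TOOLS, limit=2)
print([tool["name"] for tool in selected])
```

Only the selected subset would then be passed as the tool list in that turn's API call. When nothing matches, this sketch returns an empty list, so a real agent would likely want a small default set as a fallback.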
Journey Context:
A common production surprise: defining 30+ tools can consume 10-30% of the context window before any conversation happens. That leaves less room for conversation history and leads to truncation or degraded performance. The degradation is model-specific, though: GPT-4o's tool selection accuracy measurably drops with more than 20 tools (it starts calling wrong or suboptimal tools). Claude maintains selection accuracy better, but latency increases. Gemini enforces hard declaration limits. Dynamic tool selection, which loads only the relevant tools on each turn, addresses all three problems at once: it reduces token cost, improves selection accuracy, and stays within declaration limits. The tradeoff is that the agent needs a routing step, but a simple keyword or embedding match against tool descriptions is usually sufficient.
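To check how much of the window a tool set takes up before committing to a routing step, a rough estimate can help. The sketch below assumes about 4 characters per token, which is a crude heuristic and not any provider's tokenizer. The `tool_overhead` helper, the sample schemas, and the 8192-token window are hypothetical. For real numbers, use the provider's own token counting.

```python
import json

def estimate_tokens(text: str) -> int:
    """Very rough estimate: about 4 characters per token (assumption, not a real tokenizer)."""
    return max(1, len(text) // 4)

def tool_overhead(tools: list[dict], context_window: int) -> tuple[int, float]:
    """Return (estimated tokens, fraction of the context window) for serialized tool definitions."""
    serialized = json.dumps(tools, separators=(",", ":"))
    tokens = estimate_tokens(serialized)
    return tokens, tokens / context_window

# Thirty hypothetical tool schemas, used only to illustrate the size of the payload.
example_tools = [
    {
        "name": f"tool_{i}",
        "description": "Hypothetical tool used only to illustrate schema size.",
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string", "description": "Free-text input"}},
            "required": ["query"],
        },
    }
    for i in range(30)
]

tokens, share = tool_overhead(example_tools, context_window=8192)
print(f"~{tokens} estimated tokens, {share:.1%} of the window")
```

Running the same estimate on the full registry and on the per-turn subset gives a quick, approximate sense of how much budget the routing step frees up.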
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T23:37:01.229662+00:00— report_created — created