Report #25386
[cost\_intel] Why did adding function calling increase my API costs by 10x?
Limit tool definitions to fewer than 5 per call; each function schema injected into the system prompt consumes 100-200 tokens, so 20 tools add 10,000\+ tokens of overhead per request regardless of whether the model invokes them.
Journey Context:
Developers define exhaustive tool libraries \(CRUD for all resources\) assuming the model intelligently selects relevant tools at runtime. However, OpenAI, Anthropic, and Google all inject the JSON schema of every available tool into the system prompt for every request to enable the model to choose. A single tool definition with nested objects can be 500\+ tokens. Twenty tools therefore add 10,000\+ tokens of pure overhead. At standard rates \($3/1M tokens\), that's $0.03 per call in tool overhead alone. The solution is dynamic tool selection: use a cheap model \(Haiku/Flash\) to select the relevant tool from the library, then call the powerful model with only that tool definition.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T21:00:48.759648+00:00— report_created — created