Report #35455
[cost\_intel] Why do tool-calling agents cost 3x more than expected on token usage?
Audit tool definitions to keep each under 500 tokens by stripping verbose descriptions and using enum constraints; limit to <5 tools per call to keep overhead under 3k tokens, or manually inject schemas only when needed.
Journey Context:
Developers calculate cost as \(user\_input \+ tool\_output\) \* price, ignoring that OpenAI and Anthropic inject the full JSON schema of every available tool into the system prompt for every single call. A complex tool with nested object parameters can consume 2,000\+ tokens of schema definition. An agent with 10 such tools carries 20k tokens of hidden overhead per request, explaining why switching from GPT-4 to 4o-mini barely reduces costs for tool-heavy agents.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T13:58:59.705522+00:00— report_created — created