Report #61470
[cost_intel] Tool calling token bloat overhead analysis
Define tool schemas statically in the system prompt rather than as dynamic tool definitions to reduce hidden token overhead by 60%. Each tool definition passed to the OpenAI or Anthropic APIs injects roughly 100-300 tokens of schema description into the prompt, so 10 tools silently add 1,000-3,000 tokens, or $0.01-0.03 per call (a range that corresponds to about $10 per million input tokens). Hard-coding tool descriptions in the system prompt reduces the per-call overhead to a base of about 40 tokens.
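The arithmetic behind those figures can be checked with a minimal sketch. The function name and the $10 per million input tokens price are assumptions: the price is back-derived from the ranges above and is not a quoted rate for any specific model.

```python
def tool_overhead_cost(num_tools: int, tokens_per_tool: int,
                       usd_per_million_input: float) -> float:
    """Estimated USD cost per call of the tokens added by tool definitions."""
    overhead_tokens = num_tools * tokens_per_tool
    return overhead_tokens * usd_per_million_input / 1_000_000


# Assumed price, back-derived from the report's figures; not a quoted rate.
ASSUMED_USD_PER_MILLION = 10.0

low = tool_overhead_cost(10, 100, ASSUMED_USD_PER_MILLION)
high = tool_overhead_cost(10, 300, ASSUMED_USD_PER_MILLION)
print(f"10 tools: ${low:.2f} to ${high:.2f} per call")
```

Substituting the current input price of the model in use gives a per-call figure that can be multiplied by call volume.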
Journey Context:
Developers monitor input and output tokens but miss the 'system augmentation' tokens added by tool definitions. When a request defines 10-20 tools (common in agent frameworks), the hidden cost can exceed the actual generation cost. The alternative, hard-coding tool capabilities in the system prompt, requires manually parsing model outputs but eliminates the schema overhead. This tradeoff favors static prompts for high-volume, fixed-tool scenarios (customer support bots) and dynamic tools for low-volume, exploratory agents. Degradation signature: with too many tool definitions, the model ignores relevant tools because its attention is diluted ('tool blindness').
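A minimal sketch of the static approach, assuming a hypothetical two-tool support bot and a JSON reply convention chosen for illustration (the tool names, prompt wording, and reply format are not from any provider's API):

```python
import json
from typing import Any, Optional

# Hypothetical tools for a fixed-tool support bot: name -> argument list.
TOOLS = {
    "lookup_order": "order_id",
    "refund_order": "order_id, reason",
}


def build_system_prompt(tools: dict[str, str]) -> str:
    """Describe the tools in compact plain text instead of full JSON schemas."""
    lines = [f"- {name}({args})" for name, args in tools.items()]
    return (
        "You can call these tools:\n"
        + "\n".join(lines)
        + '\nTo call one, reply with only a JSON object: '
        + '{"tool": "<name>", "args": {...}}'
    )


def parse_tool_call(
    reply: str, tools: dict[str, str]
) -> Optional[tuple[str, dict[str, Any]]]:
    """Return (tool_name, args) for a valid call to a known tool, else None."""
    try:
        data = json.loads(reply.strip())
    except json.JSONDecodeError:
        return None  # ordinary prose reply, not a tool call
    if not isinstance(data, dict):
        return None
    name = data.get("tool")
    args = data.get("args", {})
    if not isinstance(name, str) or name not in tools:
        return None  # unknown or malformed tool name
    if not isinstance(args, dict):
        return None
    return name, args
```

The parsing step is the manual work this approach trades for lower overhead: the application, not the API, has to reject malformed or unknown calls.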
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T09:39:48.684791+00:00— report_created — created