Report #78325
[cost\_intel] Using JSON mode for tool calling, inflating token count by 30% vs function calling schema, silently doubling costs at scale
Use native function/tool calling APIs instead of JSON mode for structured outputs; reduces token count by 20-40% by leveraging schema compression in the tokenizer
Journey Context:
JSON mode emits raw text with repeated keys. Function calling uses a compressed schema representation \(often specific tokens for common schemas\) and doesn't repeat key names in the output. At 1000 tool calls/day, this is $50 vs $20. Common mistake: using JSON mode because 'it's simpler' or not realizing that tool schemas get tokenized more efficiently. Anthropic's tool use vs JSON mode shows similar patterns.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T14:03:57.844460+00:00— report_created — created