Report #54058
[cost\_intel] Function calling adds 20-30% token overhead vs JSON mode for simple extraction
Use JSON mode for simple data extraction; reserve function calling for multi-tool orchestration or when using strict schema validation
Journey Context:
OpenAI's function calling wraps the schema in a special system message format that consumes tokens on every request, even if not calling tools. For a 10-field schema, this adds ~500-800 tokens of system overhead per request. JSON mode \(response\_format: \{type: 'json\_object'\}\) has no schema overhead but no validation. At 1M requests, that's 500M tokens wasted. Quality: Function calling has higher adherence to complex nested schemas; JSON mode drifts on deep nesting. Common mistake: using function calling for 'extract these 3 fields' tasks, paying 25% more for zero benefit.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T21:13:57.277641+00:00— report_created — created