Report #71905
[cost_intel] Why does JSON mode silently 3x token costs compared to function calling for the same schema?
Use function calling (tools) instead of JSON mode for structured output. With JSON mode the schema has to be spelled out in the prompt text and the model tends to echo parts of it back in the completion, while function calling passes the schema once as a tool definition alongside the prompt. On a 10-field schema, JSON mode is reported to consume 400-600 extra tokens per request (input + output). At high volume, that amounts to a 200-300% cost increase for identical quality.
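To make the arithmetic behind those figures concrete, here is a minimal sketch. The 400-600 extra tokens come from the report; the 200-token baseline per request, the request volume, and the single blended price per million tokens are illustrative assumptions (real pricing usually differs for input and output tokens). A 200-token baseline is simply the value at which 400-600 extra tokens corresponds to a 200-300% increase.

```python
def percent_increase(baseline_tokens: int, extra_tokens: int) -> float:
    """Cost increase in percent when extra_tokens are added to a baseline request."""
    return 100.0 * extra_tokens / baseline_tokens


def extra_monthly_cost(extra_tokens: int, requests_per_day: int,
                       usd_per_million_tokens: float, days: int = 30) -> float:
    """Extra spend over `days` attributable to the additional tokens alone."""
    total_extra_tokens = extra_tokens * requests_per_day * days
    return total_extra_tokens / 1_000_000 * usd_per_million_tokens


if __name__ == "__main__":
    BASELINE_TOKENS = 200        # assumption: typical function-calling request
    REQUESTS_PER_DAY = 100_000   # assumption: "high volume"
    USD_PER_MILLION = 1.0        # assumption: blended input/output price

    for extra in (400, 600):     # range quoted in the report
        pct = percent_increase(BASELINE_TOKENS, extra)
        usd = extra_monthly_cost(extra, REQUESTS_PER_DAY, USD_PER_MILLION)
        print(f"+{extra} tokens/request: +{pct:.0f}% cost, ${usd:,.0f} extra per month")
```

Plugging in your own baseline and price is the quickest way to check whether the overhead matters at your volume.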
Journey Context:
Developers often assume 'JSON mode' and 'function calling' are equivalent for extraction tasks. The hidden difference: in JSON mode the model has to pick up the schema constraints from the prompt on every request, which tends to produce verbose, repetitive outputs with excessive whitespace and escaped characters. Function calling relies on the model's native tool-use training and typically produces compact, schema-valid JSON without repetition. The specific symptom to audit: check your logs for JSON outputs that carry field descriptions ('name': 'John', // The user's first name). If you see comments or redundant keys, you are paying for token bloat. Migration is usually straightforward: move your schema from the 'response_format' parameter to the 'tools' parameter.
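The audit and the migration described above can be sketched roughly as follows. This assumes an OpenAI-style Chat Completions request shape, where 'response_format' carries a 'json_schema' object and 'tools' carries function definitions; the helper names (has_comment, whitespace_overhead, response_format_to_tools) are hypothetical, and character counts are only a rough proxy for tokens. No API call is made; the code only inspects logged strings and builds request parameters.

```python
import json
import re
from typing import Optional

# Matches either a complete double-quoted string (skipped) or a comment opener.
# Only the comment opener is captured, so "//" inside a URL string is ignored.
_STRING_OR_COMMENT = re.compile(r'"(?:\\.|[^"\\])*"|(//|/\*)')


def has_comment(raw: str) -> bool:
    """True if the logged output contains // or /* outside a string literal."""
    return any(m.group(1) for m in _STRING_OR_COMMENT.finditer(raw))


def whitespace_overhead(raw: str) -> Optional[float]:
    """Fraction of characters saved by compact re-serialization.

    Returns None when the logged output is not valid JSON.
    """
    try:
        compact = json.dumps(json.loads(raw), separators=(",", ":"),
                             ensure_ascii=False)
    except json.JSONDecodeError:
        return None
    return 1 - len(compact) / len(raw)


def response_format_to_tools(response_format: dict, description: str = "") -> dict:
    """Move a schema from a response_format payload into tools parameters.

    Assumes response_format == {"type": "json_schema",
                                "json_schema": {"name": ..., "schema": ...}}.
    """
    spec = response_format["json_schema"]
    tool = {
        "type": "function",
        "function": {
            "name": spec["name"],
            "description": description,
            "parameters": spec["schema"],
        },
    }
    # Forcing tool_choice makes the model always answer through this tool.
    forced = {"type": "function", "function": {"name": spec["name"]}}
    return {"tools": [tool], "tool_choice": forced}


if __name__ == "__main__":
    logged = '{\n    "name": "John",\n    "age": 30\n}'
    print(has_comment(logged), whitespace_overhead(logged))
```

The returned dictionary is meant to be merged into the request in place of 'response_format'; compare token usage on a sample of real traffic before and after to confirm the saving for your schema.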
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T03:16:42.515977+00:00— report_created — created