Report #49463
[cost_intel] Why does structured output (JSON mode) cost 30-50% more than free-form for equivalent content?
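JSON mode can generate roughly 30-50% more output tokens than free-form text for the same content, because of structural characters: braces, quotes, key names, whitespace, and escape sequences. By that estimate, a 100-token answer costs about 130-150 tokens in JSON. Hidden 'validation' retries are sometimes cited as a further cost, but this is unverified; check the usage figures your provider reports before assuming them. Mitigation: (1) Ask for compact, single-line JSON in the prompt; `separators=(',',':')` in Python's `json.dumps` only compacts JSON you serialize yourself, not what the model generates, (2) Flatten nested objects to depth 1, (3) Use regex extraction from free-form text for simple fields instead of JSON mode.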
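The three mitigations can be sketched in Python as follows. This is a minimal illustration, not a measurement: the `record` payload, the `flatten` helper, and the `Answer:` reply format are all hypothetical, and character counts are only a rough proxy for token counts (use your provider's tokenizer for real numbers).

```python
import json
import re


def flatten(obj, parent="", sep="."):
    """Flatten nested dicts to depth 1, joining key paths with `sep`."""
    flat = {}
    for key, value in obj.items():
        name = f"{parent}{sep}{key}" if parent else key
        if isinstance(value, dict):
            flat.update(flatten(value, name, sep))
        else:
            flat[name] = value
    return flat


# Hypothetical payload, used only to compare serialized sizes.
record = {"answer": "hello", "meta": {"confidence": 0.9, "source": "kb"}}

# Pretty-printed and nested: the shape that inflates output.
pretty = json.dumps(record, indent=2)

# Compact separators plus flattening: fewer structural characters.
compact = json.dumps(flatten(record), separators=(",", ":"))

# Regex extraction from a hypothetical free-form reply, for a simple field.
free_form = "Answer: hello"
match = re.search(r"Answer:\s*(.+)", free_form)
answer = match.group(1).strip() if match else None

print(len(pretty), len(compact), answer)
```

Because the model, not `json.dumps`, produces the billed output, the compact form here shows the target shape to request in the prompt rather than a client-side fix.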
Journey Context:
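Teams switch to `response_format: {type: 'json_object'}` for reliability but see bills spike by about 40%. The model outputs a pretty-printed `{ "answer": "hello" }` with line breaks and indentation (about 10 tokens) vs `hello` (1 token). Some teams also suspect that the model silently retries when it generates invalid JSON internally, burning tokens; this is unverified and depends on the provider. For 1M requests, the estimated extra cost is about $2000 compared with regex parsing of free-form text, though the exact figure depends on the model's output-token price and the answer length.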
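The cost arithmetic can be checked with a short sketch. The function names and the per-request overhead values are assumptions for illustration; no real provider pricing is used. The second function works backwards from the $2000 figure above to the output-token price it implies.

```python
def extra_output_cost(requests, extra_tokens_per_request, usd_per_million_output_tokens):
    """Extra spend caused by JSON overhead on output tokens."""
    extra_tokens = requests * extra_tokens_per_request
    return extra_tokens / 1_000_000 * usd_per_million_output_tokens


def implied_price(total_extra_usd, requests, extra_tokens_per_request):
    """Output-token price (USD per million) implied by a total extra cost."""
    return total_extra_usd / (requests * extra_tokens_per_request / 1_000_000)


# A 100-token answer with 30-50% overhead means 30-50 extra tokens per request.
for overhead_tokens in (30, 50):
    price = implied_price(2000, 1_000_000, overhead_tokens)
    print(f"{overhead_tokens} extra tokens/request -> ${price:.2f} per 1M output tokens")
```

If the price this implies is far from what you actually pay per million output tokens, scale the $2000 estimate accordingly before using it for budgeting.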
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T13:30:24.533198+00:00— report_created — created