Report #62633
[cost\_intel] How does token bloat in JSON mode silently increase API costs by 20-40%?
Avoid verbose JSON schemas in structured outputs; use compact keys, omit whitespace, and prefer arrays over objects where possible to reduce token count by 20-40%. For high-volume pipelines, switching from pretty-printed JSON to compact JSON saves $500\+ per million requests at GPT-4o prices.
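As a rough illustration of the pretty-printed versus compact difference, the sketch below serializes the same payload both ways with Python's standard `json` module. The `record` contents are made up for the example. Character count is used only as a loose proxy for token count; actual savings depend on the model's tokenizer and should be measured with it.

```python
import json

# Hypothetical payload, invented for illustration only.
record = {
    "customer_name": "Ada Lovelace",
    "customer_shipping_address_line_1": "12 Example Street",
    "order_items": [
        {"sku": "A-100", "quantity": 2},
        {"sku": "B-200", "quantity": 1},
    ],
}

# Pretty-printed: newlines and indentation are part of the string,
# so they are tokenized along with everything else.
pretty = json.dumps(record, indent=2)

# Compact: no whitespace after ',' or ':' and no newlines.
compact = json.dumps(record, separators=(",", ":"))

# Both forms decode to exactly the same data.
assert json.loads(pretty) == json.loads(compact)

# Character savings, as a rough proxy for token savings.
saved = 1 - len(compact) / len(pretty)
print(f"pretty: {len(pretty)} chars, compact: {len(compact)} chars, saved: {saved:.0%}")
```

The same `separators=(",", ":")` setting applies to any JSON you place in the prompt. For model output, the equivalent step is instructing the model to emit minified JSON.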
Journey Context:
Developers using 'JSON mode' or 'Structured Outputs' often copy-paste verbose schemas with descriptive field names \('customer\_shipping\_address\_line\_1'\) and pretty-printed whitespace, assuming tokens are cheap. However, every space, newline, and verbose key is tokenized. A typical structured output with 50 fields in verbose JSON can consume 2-3x more tokens than a compact representation. The cost multiplier is easy to miss because nothing fails: if your response is 500 tokens verbose vs 200 compact, and you pay $10 per 1M output tokens \(GPT-4o\), that's $5 vs $2 per 1k calls, a 2.5x cost increase. The fix is to use compact keys \(e.g., 'a1' instead of 'customer\_shipping\_address\_line\_1'\) and disable pretty-printing. If the consumer needs readability, map the compact keys back to descriptive names after the API call. Also, prefer flat structures over deeply nested JSON where possible, as nesting adds braces and colons that consume tokens.
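A minimal sketch of the two ideas above: the cost arithmetic for the 500-token vs 200-token example at $10 per 1M output tokens, and expanding compact keys back to readable names after the API call. `KEY_MAP`, `cost_per_1k_calls`, `expand_keys`, and the sample response string are all hypothetical names and data introduced for illustration. The price is the figure quoted in the text and should be checked against current pricing.

```python
import json

# Price quoted in the text (USD per 1M output tokens); verify before relying on it.
PRICE_PER_MILLION_OUTPUT_TOKENS = 10.00


def cost_per_1k_calls(tokens_per_response: int,
                      price_per_million: float = PRICE_PER_MILLION_OUTPUT_TOKENS) -> float:
    """Output-token cost in USD for 1,000 calls of the given response size."""
    return tokens_per_response * 1000 * price_per_million / 1_000_000


verbose_cost = cost_per_1k_calls(500)  # 500k tokens per 1k calls
compact_cost = cost_per_1k_calls(200)  # 200k tokens per 1k calls
print(f"verbose: ${verbose_cost:.2f}, compact: ${compact_cost:.2f}, "
      f"ratio: {verbose_cost / compact_cost:.1f}x")

# Hypothetical mapping from compact keys (sent to / returned by the model)
# to the descriptive names that downstream consumers expect.
KEY_MAP = {
    "n": "customer_name",
    "a1": "customer_shipping_address_line_1",
}


def expand_keys(obj, key_map):
    """Recursively rename compact keys; keys not in the map are left unchanged."""
    if isinstance(obj, dict):
        return {key_map.get(k, k): expand_keys(v, key_map) for k, v in obj.items()}
    if isinstance(obj, list):
        return [expand_keys(item, key_map) for item in obj]
    return obj


# Stand-in for a compact model response (not real API output).
raw_response = '{"n":"Ada Lovelace","a1":"12 Example Street"}'
expanded = expand_keys(json.loads(raw_response), KEY_MAP)
print(expanded)
```

Keeping the mapping in one place means the compact keys only ever exist on the wire; application code and logs still see the descriptive names.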
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T11:36:59.558114+00:00— report_created — created