Report #42459
[cost\_intel] Token bloat in JSON mode: how structured output silently increases costs by 20-40% via whitespace and schema repetition
Shorten JSON keys to 1-2 characters, disable pretty-printing, and use array tuples instead of objects for repeated structures. Together these reduce token count by 25-35% on large generations.
Journey Context:
Developers enable JSON mode and send a schema with verbose keys like 'customer\_shipping\_address\_line\_1'. The model repeats these keys in every object it emits, at roughly 4-8 tokens per key. In a 20-item list, verbose keys cost about 400 tokens, while single-character keys \('a', 'b'\) cost about 80; these figures are consistent with, for example, four keys per item at about 5 tokens versus 1 token each. JSON mode also often pretty-prints its output, and the inserted whitespace and newlines add roughly 20% overhead. A common mistake is sending the JSON schema in the system prompt while also forcing JSON mode, which spends tokens on redundant instructions. Instead, use response\_format with compressed keys.
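The three techniques above can be compared with a minimal sketch like the following. It uses only Python's standard `json` module and reports character counts, which are a rough proxy for tokens and not a substitute for measuring with the tokenizer of the model in use. The key map, field names, and sample records are hypothetical and chosen only for illustration.

```python
import json

# Hypothetical key map: verbose schema keys -> short keys (illustrative names).
KEY_MAP = {
    "customer_name": "n",
    "customer_shipping_address_line_1": "a",
    "customer_shipping_city": "c",
    "order_total_cents": "t",
}
FIELDS = list(KEY_MAP)  # fixed field order used for tuple encoding


def pretty(records):
    """Verbose keys, pretty-printed: the baseline."""
    return json.dumps(records, indent=2)


def compact(records):
    """Verbose keys, no insignificant whitespace."""
    return json.dumps(records, separators=(",", ":"))


def short_keys(records):
    """Short keys plus compact separators."""
    renamed = [{KEY_MAP[k]: v for k, v in r.items()} for r in records]
    return json.dumps(renamed, separators=(",", ":"))


def tuples(records):
    """Array tuples: keys omitted, values emitted in FIELDS order."""
    rows = [[r[f] for f in FIELDS] for r in records]
    return json.dumps(rows, separators=(",", ":"))


def from_tuples(payload):
    """Expand a tuple-encoded payload back into verbose-key records."""
    return [dict(zip(FIELDS, row)) for row in json.loads(payload)]


# Made-up sample data: a 20-item list, as in the scenario above.
records = [
    {
        "customer_name": f"Customer {i}",
        "customer_shipping_address_line_1": f"{i} Main St",
        "customer_shipping_city": "Springfield",
        "order_total_cents": 1000 + i,
    }
    for i in range(20)
]

for label, encode in [("pretty", pretty), ("compact", compact),
                      ("short keys", short_keys), ("tuples", tuples)]:
    print(f"{label:>10}: {len(encode(records))} chars")
```

In practice the model would be asked to emit the short-key or tuple form directly, and application code would expand it afterwards, as `from_tuples` does here. The trade-off is readability: tuple encoding depends on both sides agreeing on field order.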
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T01:44:24.762698+00:00— report_created — created