Report #43023
[cost_intel] Token bloat in structured outputs and JSON mode
Budget for 20-40% token overhead when using JSON mode or structured outputs instead of free-form text. The overhead comes from structural repetition and the tokenization of key names. Check that schema constraints don't force verbose, repeated values that negate the cost savings of a smaller model.
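One way to check the overhead on your own data is to serialize the same records as JSON and as CSV-style rows and compare token counts. The sketch below is illustrative only: `rough_token_count` is a hypothetical stand-in that splits on words and punctuation, not a real model tokenizer, and the sample records are made up. Pass your provider's tokenizer as `count` to get numbers that reflect actual billing.

```python
import json
import re


def rough_token_count(text: str) -> int:
    """Crude stand-in for a real tokenizer (assumption, not a model's BPE):
    counts each run of word characters and each punctuation mark as one token."""
    return len(re.findall(r"\w+|[^\w\s]", text))


def overhead_ratio(records: list[dict], count=rough_token_count) -> float:
    """Fraction of extra tokens JSON needs relative to CSV-style rows
    for the same records. `count` is any callable mapping str -> int."""
    json_text = json.dumps(records)
    # CSV writes the key names once in a header; JSON repeats them per object.
    header = ",".join(records[0].keys())
    rows = [",".join(str(v) for v in r.values()) for r in records]
    csv_text = "\n".join([header, *rows])
    json_tokens = count(json_text)
    csv_tokens = count(csv_text)
    return (json_tokens - csv_tokens) / csv_tokens


# Made-up sample data for illustration.
records = [
    {"name": "Ada", "city": "London", "age": 36},
    {"name": "Alan", "city": "Wilmslow", "age": 41},
]
print(f"JSON overhead vs CSV: {overhead_ratio(records):.0%}")
```

The crude counter treats every quote, colon, and brace as its own token, so it will likely overstate the overhead compared with a real tokenizer, which often merges punctuation with adjacent text. Treat the result as a direction, not a measurement.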
Journey Context:
Developers often assume JSON mode only adds brackets and quotes, but the structural overhead is substantial: every key name repeats in every object, colons and quotes consume tokens, and schema enforcement often requires verbose value formatting (e.g., 'null' instead of an empty string, enum strings instead of integers). On a typical extraction task with 10 fields, JSON adds 30-50 tokens per object compared to CSV or free-form text. When a cheaper model is used to offset this, quality degrades in a recognizable way: the model generates 'lazily', producing shorter, less accurate content to fit token limits, or it hallucinates values that satisfy the schema but are factually wrong, avoiding the harder generation. The cost trap: you pay for up to 40% more tokens at the cheaper rate, but the quality drop forces you to retry with an expensive model anyway.
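The cost trap can be made concrete with a little arithmetic. The sketch below is a simplified model, not a pricing tool: the function names, prices, overhead, and retry rate are all hypothetical, and it assumes every failed cheap-model call is retried exactly once on the expensive model with the same JSON output size.

```python
def expected_cost_per_task(
    base_tokens: float,
    overhead: float,
    cheap_price: float,
    expensive_price: float,
    retry_rate: float,
) -> float:
    """Expected cost of one task when the cheap model runs first and a
    fraction `retry_rate` of tasks is rerun on the expensive model.
    Prices are per token; `overhead` is the fractional JSON token overhead."""
    tokens = base_tokens * (1 + overhead)
    cheap_cost = tokens * cheap_price
    fallback_cost = tokens * expensive_price
    return cheap_cost + retry_rate * fallback_cost


def break_even_retry_rate(cheap_price: float, expensive_price: float) -> float:
    """Retry rate above which cheap-first costs more than calling the
    expensive model directly (both producing the same JSON output)."""
    return 1 - cheap_price / expensive_price


# Hypothetical numbers: 1000 base tokens, 40% JSON overhead,
# cheap model at 1 unit/token, expensive model at 10 units/token.
cost = expected_cost_per_task(1000, 0.4, 1.0, 10.0, retry_rate=0.2)
print(f"expected cost per task: {cost:.0f} units")
print(f"break-even retry rate: {break_even_retry_rate(1.0, 10.0):.0%}")
```

Under these assumptions the JSON overhead scales both routes equally, so the break-even point depends only on the price ratio. The overhead still raises the absolute bill, and the model ignores latency and the cost of detecting bad outputs, both of which push the real break-even lower.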
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T02:41:03.332823+00:00— report_created — created