Report #44143
[cost_intel] JSON mode and structured outputs increase token costs 30-50% vs unstructured extraction
Use unstructured outputs with regex extraction or JSON repair for high-volume extraction where a 1-2% parsing failure rate is acceptable; reserve JSON mode for complex nested schemas or for cases where a parsing error costs more than the token premium.
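A minimal sketch of the regex route, assuming the model is prompted to emit one line per record in the form `Entity: <name>, Value: <value>`. The pattern and the helper names `parse_pair` and `parse_batch` are hypothetical, chosen for illustration. Lines that do not match are collected separately so they can be retried or reviewed instead of being silently dropped.

```python
import re
from typing import Optional

# Assumed output contract (hypothetical): "Entity: <name>, Value: <value>" on one line.
PAIR_RE = re.compile(r"^\s*Entity:\s*(?P<entity>.+?),\s*Value:\s*(?P<value>.+?)\s*$")


def parse_pair(line: str) -> Optional[dict]:
    """Return {'entity': ..., 'value': ...}, or None if the line does not match."""
    match = PAIR_RE.match(line)
    if match is None:
        return None
    return {"entity": match.group("entity"), "value": match.group("value")}


def parse_batch(lines):
    """Split model outputs into parsed records and failed lines."""
    parsed, failed = [], []
    for line in lines:
        record = parse_pair(line)
        if record is None:
            failed.append(line)  # candidates for retry or human review
        else:
            parsed.append(record)
    return parsed, failed
```

The ratio `len(failed) / len(lines)` gives an observed failure rate that can be compared against the 1-2% estimate before committing to this approach.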
Journey Context:
OpenAI's JSON mode and Structured Outputs enforce valid JSON (and, for Structured Outputs, schema adherence) via constrained decoding, but this increases output token count by an estimated 30-50% compared to unstructured outputs. For example, extracting a simple entity pair as unstructured text ('Entity: Apple, Value: $2T') consumes ~15 tokens, while the equivalent JSON ({"entity": "Apple", "value": "$2T"}), with its braces, quoted keys, and any whitespace, consumes ~25-30 tokens. On a payload this short the relative overhead is above the 30-50% range, because the fixed JSON syntax is spread over fewer content tokens. At a scale of 1 billion tokens per day, a 50% overhead increases costs from $2,500/day to $3,750/day (assuming $2.50/MTok). The alternative, unstructured output with regex extraction or JSON repair, fails on 1-2% of edge cases (unescaped quotes, multiline strings). If the downstream cost of handling a parsing error (retry, human review, data loss) exceeds roughly $0.35 per incident, JSON mode is cheaper overall; otherwise, unstructured extraction wins. This break-even figure equals the daily token premium divided by the number of failed parses per day, so it depends on request volume, which is not stated here.
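The arithmetic above can be sketched as follows. The token volume, price, and 50% overhead come from the figures in this report; the request volume and failure rate passed to `break_even_error_cost` are hypothetical inputs, since the report does not state how many requests make up the 1 billion tokens. The function names are illustrative.

```python
def daily_cost(tokens_per_day: float, usd_per_mtok: float) -> float:
    """Daily spend in USD for a given token volume and price per million tokens."""
    return tokens_per_day / 1_000_000 * usd_per_mtok


def break_even_error_cost(
    baseline_tokens_per_day: float,
    usd_per_mtok: float,
    overhead: float,
    requests_per_day: float,
    failure_rate: float,
) -> float:
    """Per-incident error cost above which JSON mode is cheaper overall.

    Computed as the daily token premium of JSON mode divided by the number
    of parsing failures per day expected from unstructured extraction.
    """
    premium = daily_cost(baseline_tokens_per_day, usd_per_mtok) * overhead
    failures_per_day = requests_per_day * failure_rate
    return premium / failures_per_day


# Figures from the report: 1B tokens/day at $2.50/MTok, 50% overhead.
baseline = daily_cost(1_000_000_000, 2.50)
with_json = baseline * 1.5

# Hypothetical volume: 100,000 requests/day with a 1% failure rate.
threshold = break_even_error_cost(1_000_000_000, 2.50, 0.5, 100_000, 0.01)
```

Because the threshold scales inversely with the number of failures per day, workloads with many small requests reach break-even at a much lower per-incident cost than workloads with a few large ones.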
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T04:33:59.942793+00:00— report_created — created