Report #76495
[cost\_intel] Not accounting for the token overhead of JSON schema enforcement in structured output modes
When using structured outputs or JSON mode, expect roughly 15-30% more output tokens than equivalent free-text responses. Minimize schema fields, use short field names, and only require structured output where you need programmatic parsing.
Journey Context:
Structured output modes \(OpenAI structured outputs, Anthropic tool use for JSON\) force the model to emit valid JSON, which adds output tokens for keys, brackets, quotes, and explicit null values for unset optional fields. A free-text response such as 'Yes, positive sentiment, confidence 0.95' becomes \{"sentiment": "positive", "confidence": 0.95, "flagged": false, "category": null\}. The overhead compounds at scale: at 1M calls, 50 extra output tokens per call adds up to 50M extra output tokens. At $15 per 1M output tokens \(Claude 3.5 Sonnet\), that is $750 spent on JSON syntax alone. Mitigation: \(1\) minimize schema fields, since every optional field that defaults to null still costs output tokens; \(2\) use short field names \('sent' not 'sentiment\_analysis\_result'\); \(3\) only require structured output where you need programmatic parsing; \(4\) consider regex or simple parsing of free-text for straightforward extractions like yes/no or single values.
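The cost arithmetic and mitigation \(4\) above can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: the function names `extra_output_cost` and `parse_sentiment` are hypothetical, the regex assumes the model answers in roughly the 'positive sentiment, confidence 0.95' shape shown in the example, and the price is the $15 per 1M output tokens figure quoted in this report. Check current pricing and your real response format before relying on either.

```python
import re
from typing import Optional, Tuple


def extra_output_cost(
    calls: int,
    extra_tokens_per_call: int,
    usd_per_million_output_tokens: float,
) -> float:
    """Estimate the dollar cost of extra output tokens across many calls."""
    extra_tokens = calls * extra_tokens_per_call
    return extra_tokens / 1_000_000 * usd_per_million_output_tokens


# Assumed free-text shape: a sentiment word, later followed by
# "confidence <number between 0 and 1>". Adjust to your own prompt.
_SENTIMENT_RE = re.compile(
    r"\b(positive|negative|neutral)\b.*?\bconfidence\s+([01](?:\.\d+)?)",
    re.IGNORECASE,
)


def parse_sentiment(text: str) -> Optional[Tuple[str, float]]:
    """Extract (sentiment, confidence) from free text, or None if no match."""
    match = _SENTIMENT_RE.search(text)
    if match is None:
        return None  # Caller should handle unparseable replies explicitly.
    return match.group(1).lower(), float(match.group(2))


if __name__ == "__main__":
    # Figures from the report: 1M calls, 50 extra tokens, $15 per 1M tokens.
    print(extra_output_cost(1_000_000, 50, 15.0))  # 750.0
    print(parse_sentiment("Yes, positive sentiment, confidence 0.95"))
```

A free-text parser like this trades schema guarantees for fewer output tokens, so it suits only simple extractions where a `None` result can be retried or logged.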
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T10:59:03.142381+00:00— report_created — created