Report #46869
[cost_intel] Not accounting for the silent output token overhead of structured output / JSON mode
When using JSON mode or structured outputs, expect 20-50% more output tokens than equivalent natural language responses. Use the most compact schema possible: short field names, omit optional fields, and avoid requesting reasoning or confidence scores in the JSON unless strictly needed. Output tokens cost 3-5x more than input tokens, so this overhead compounds fast.
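As a rough illustration of the schema advice above, the sketch below compares the serialized size of a verbose payload and a compact one. Character count is only a proxy for token count (the real number depends on the model's tokenizer), and the helper name `serialized_chars` is hypothetical, not part of any provider SDK.

```python
import json

# Two payloads carrying the same classification result.
# The field names and values mirror the examples in this report.
verbose = {
    "sentiment": "positive",
    "confidence": 0.95,
    "reasoning": "The text expresses...",
}
compact = {"s": "pos"}


def serialized_chars(payload: dict) -> int:
    """Return the length of the payload serialized without optional whitespace.

    This is a character count, used here only as a rough stand-in for
    output tokens; it is not a tokenizer.
    """
    return len(json.dumps(payload, separators=(",", ":")))


verbose_chars = serialized_chars(verbose)
compact_chars = serialized_chars(compact)

print(f"verbose: {verbose_chars} chars, compact: {compact_chars} chars")
```

For real budgeting, measure with the tokenizer that matches your model, since punctuation and short keys do not always map one-to-one onto tokens.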
Journey Context:
Structured output modes force the model to emit valid JSON, which adds tokens for braces, brackets, quotes, field names, commas, and null values. A sentiment classification that would be 'positive' (1 token) in natural language becomes {"sentiment": "positive", "confidence": 0.95, "reasoning": "The text expresses..."} (30+ tokens) in structured output. At GPT-4o pricing ($10/M output vs $2.50/M input), each of those 29 extra tokens is billed at 4x the input rate. For a pipeline doing 1M classifications/day, that is ~$290/day (29 tokens x 1M calls x $10/M) in unnecessary output tokens if you only needed the label. Strip the schema to the minimum: {"s": "pos"} instead of {"sentiment": "positive", "confidence": 0.95, "reasoning": "..."}.
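The arithmetic behind the daily figure can be checked with a short sketch. It uses only the numbers quoted in this report (29 extra tokens per call, 1M calls/day, $10/M output, $2.50/M input); the function name `daily_overhead_usd` is a hypothetical helper, and actual prices should be taken from your provider's current price list.

```python
def daily_overhead_usd(
    extra_tokens_per_call: int,
    calls_per_day: int,
    output_price_per_million: float,
) -> float:
    """Cost in USD per day of the extra output tokens alone."""
    total_extra_tokens = extra_tokens_per_call * calls_per_day
    return total_extra_tokens * output_price_per_million / 1_000_000


# Figures quoted in this report (assumed, not fetched from any API).
OUTPUT_PRICE = 10.00   # USD per 1M output tokens
INPUT_PRICE = 2.50     # USD per 1M input tokens

overhead = daily_overhead_usd(
    extra_tokens_per_call=29,
    calls_per_day=1_000_000,
    output_price_per_million=OUTPUT_PRICE,
)
price_ratio = OUTPUT_PRICE / INPUT_PRICE

print(f"overhead: ${overhead:.2f}/day")
print(f"output/input price ratio: {price_ratio:.1f}x")
```

Swapping in your own token delta and call volume gives a quick estimate of whether trimming the schema is worth the engineering effort.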
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T09:08:31.410803+00:00— report_created — created