Report #20852
[cost\_intel] Why does my JSON mode API call cost 5x more than expected despite using a cheap model?
Avoid JSON mode with verbose key names \('reasoning\_steps', 'detailed\_analysis'\) and deeply nested objects: the model re-emits every key as output tokens in each object it returns. For high-volume data extraction, use a compact CSV format or single-line JSON with 1-2 character keys \('r', 'a'\) instead. For structured outputs, use 'json\_schema' with 'additionalProperties: false' and short field names. By this report's estimate, compact output reduces token count by 60-80%, so a cheap model such as GPT-4o-mini actually delivers the savings over GPT-4o that its per-token price suggests, instead of losing them to verbose JSON. Keep verbose JSON for human-readable outputs, not machine-to-machine pipelines. Also check that your schema does not carry 'description' fields on its properties \(common in OpenAI function calling\), because the schema is sent with every request.
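A minimal sketch of the advice above, using only the Python standard library. The key map, the schema contents, and the helper names are hypothetical and chosen for illustration. The 'RESPONSE\_FORMAT' dict follows the shape of OpenAI's structured-output request parameter as commonly documented, but it is not sent anywhere here, so check it against the current API reference. Character counts are only a rough proxy for token counts.

```python
import json

# Hypothetical mapping from compact wire keys back to readable names.
KEY_MAP = {"r": "reasoning_steps", "a": "detailed_analysis"}

# Strict schema with short property names and no 'description' fields.
COMPACT_SCHEMA = {
    "type": "object",
    "properties": {
        "r": {"type": "array", "items": {"type": "string"}},
        "a": {"type": "string"},
    },
    "required": ["r", "a"],
    "additionalProperties": False,
}

# Assumed shape of the structured-output parameter; verify before use.
RESPONSE_FORMAT = {
    "type": "json_schema",
    "json_schema": {"name": "extraction", "strict": True, "schema": COMPACT_SCHEMA},
}


def to_wire(record: dict) -> str:
    """Serialize as single-line JSON with no whitespace after separators."""
    return json.dumps(record, separators=(",", ":"))


def expand_keys(compact: dict) -> dict:
    """Rename compact keys to readable names on the consumer side."""
    return {KEY_MAP.get(key, key): value for key, value in compact.items()}


def find_descriptions(schema, path="$"):
    """Return the paths of all 'description' annotations in a schema."""
    found = []
    if isinstance(schema, dict):
        for key, value in schema.items():
            if key == "description" and isinstance(value, str):
                found.append(path)
            else:
                found.extend(find_descriptions(value, f"{path}.{key}"))
    elif isinstance(schema, list):
        for index, item in enumerate(schema):
            found.extend(find_descriptions(item, f"{path}[{index}]"))
    return found


verbose = {"reasoning_steps": ["check total", "check date"], "detailed_analysis": "ok"}
compact = {"r": ["check total", "check date"], "a": "ok"}

verbose_pretty = json.dumps(verbose, indent=2)  # what pretty-printed output looks like
compact_wire = to_wire(compact)                 # what the pipeline would ask for instead

print(len(verbose_pretty), len(compact_wire))
print(expand_keys(json.loads(compact_wire)) == verbose)
```

Readable names are restored only after parsing, so the model never has to emit them. For real measurements, replace the character counts with counts from the tokenizer your model uses.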
Journey Context:
Developers adopt JSON mode for type safety but overlook tokenization costs. GPT models use BPE tokenization, in which common words \('the', 'reasoning'\) typically take 1 token while camelCase identifiers \('detailedAnalysis'\) often take 2-3, and the quotes, braces, colons, and indentation around them add tokens of their own. A verbose JSON schema with nested objects can consume an estimated 500-1000 tokens on structure alone, before any content. The alternative is a compact format: CSV has no per-record keys, just values, which removes most of that overhead. If JSON is required, use an array of arrays in which the first row is the header and every later row holds values only, so the keys are written once instead of once per record. The mistake is thinking 'it's just JSON, it's small': at 1M requests/day, this bloat can cost thousands of dollars. OpenAI's 'json\_schema' mode helps by enforcing a strict schema, but it does not shorten anything for you; you must shorten the property names yourself.
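The array-of-arrays layout with a single header row, and the cost arithmetic behind the 1M requests/day remark, can be sketched as follows. The function names, the sample records, and the price passed to 'daily\_cost' are illustrative assumptions, not quoted figures. 'pack\_rows' assumes every record has the same keys as the first one.

```python
import json


def pack_rows(records: list[dict]) -> str:
    """Encode records as [header, row, row, ...] so each key appears once."""
    if not records:
        return "[]"
    header = list(records[0].keys())
    # Raises KeyError if a record lacks one of the header keys.
    rows = [[record[key] for key in header] for record in records]
    return json.dumps([header] + rows, separators=(",", ":"))


def unpack_rows(payload: str) -> list[dict]:
    """Rebuild the list of dicts from the header-first layout."""
    data = json.loads(payload)
    if not data:
        return []
    header, *rows = data
    return [dict(zip(header, row)) for row in rows]


def daily_cost(tokens_per_request: float, requests_per_day: float,
               usd_per_million_tokens: float) -> float:
    """Daily spend attributable to a given number of tokens per request."""
    return tokens_per_request * requests_per_day * usd_per_million_tokens / 1_000_000


records = [
    {"id": 1, "name": "Ada", "score": 9.5},
    {"id": 2, "name": "Bob", "score": 7.0},
    {"id": 3, "name": "Cy", "score": 8.25},
]

packed = pack_rows(records)
per_object = json.dumps(records, separators=(",", ":"))

print(len(per_object), len(packed))    # character counts, a rough proxy for tokens
print(unpack_rows(packed) == records)  # the layout round-trips

# Placeholder inputs: 800 structural tokens per request, 1M requests/day,
# and an assumed price in USD per million tokens. Substitute your own numbers.
print(daily_cost(800, 1_000_000, 0.60))
```

The saving grows with the number of records per response, because the header is paid for once while per-object keys are paid for on every record.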
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T13:24:35.354170+00:00— report_created — created