Report #43603
[cost_intel] Token bloat: OpenAI structured outputs vs JSON mode for schema enforcement
OpenAI's structured outputs (guaranteed JSON schema) add 15-25% latency and about 10% token overhead compared with JSON mode, because of constrained decoding. For high-throughput APIs where you control the client, use JSON mode with Zod validation instead of structured outputs to cut token costs by roughly 10% and reduce latency.
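The saving only holds while failed validations are rare, since every retry pays for a full extra generation. A rough way to check this is sketched below. It assumes each attempt fails validation independently with the same probability, that a retry costs the same as the first attempt, and that retries are unbounded. The function names are hypothetical, and the 10% overhead is the figure from this report, not a measured value.

```typescript
// Expected number of attempts until one passes validation, assuming each
// attempt fails independently with probability `failureRate` (assumption).
function expectedAttempts(failureRate: number): number {
  if (failureRate < 0 || failureRate >= 1) {
    throw new RangeError("failureRate must be in [0, 1)");
  }
  return 1 / (1 - failureRate);
}

// Validation failure rate at which JSON mode + retries costs the same as
// structured outputs with the given fractional token overhead.
function breakEvenFailureRate(tokenOverhead: number): number {
  return tokenOverhead / (1 + tokenOverhead);
}

// Cost of JSON mode + retries relative to structured outputs.
// A value below 1 means JSON mode + retries is cheaper.
function relativeCost(failureRate: number, tokenOverhead: number): number {
  return expectedAttempts(failureRate) / (1 + tokenOverhead);
}
```

Under these assumptions, a 10% overhead gives a break-even failure rate of about 9.1%: if more than roughly one response in eleven fails validation, the retries cost more than the overhead they were meant to avoid.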
Journey Context:
Developers often assume structured outputs is the 'correct', efficient path, but strict schema enforcement requires masking logits during decoding, which slows generation and increases compute. JSON mode leaves generation unconstrained, so responses can be more token-efficient, and client-side validation catches schema errors for less than the API premium. The tradeoff is reliability: structured outputs guarantee the schema, whereas JSON mode guarantees only syntactically valid JSON. For internal tools, JSON mode + retries is roughly 10% cheaper, in line with the token overhead above, as long as validation failures are rare.
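A minimal sketch of the "JSON mode + retries" pattern follows. To keep it dependency-free, a hand-written type guard stands in for the Zod schema (with Zod, the check would be a `safeParse` call), and `callModel` is a hypothetical function that sends the prompt in JSON mode and returns the raw response text. The `Ticket` shape and the default of three attempts are illustrative assumptions.

```typescript
// Illustrative target shape; a stand-in for what a Zod schema would describe.
interface Ticket {
  title: string;
  priority: number;
}

// Hand-written type guard used here in place of a Zod schema (assumption).
function isTicket(value: unknown): value is Ticket {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return typeof v.title === "string" && typeof v.priority === "number";
}

// Hypothetical: wraps whatever client call requests a JSON-mode completion.
type ModelCall = (prompt: string) => Promise<string>;

// Calls the model, validates the response client-side, and retries on failure.
async function generateValidated(
  callModel: ModelCall,
  prompt: string,
  maxAttempts = 3,
): Promise<{ value: Ticket; attempts: number }> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const raw = await callModel(prompt);
    try {
      const parsed: unknown = JSON.parse(raw);
      if (isTicket(parsed)) return { value: parsed, attempts: attempt };
    } catch {
      // Unparseable output (for example, truncated text): fall through and retry.
    }
  }
  throw new Error(`no schema-valid response after ${maxAttempts} attempts`);
}
```

Returning the attempt count alongside the value makes it easy to log the observed retry rate, which is the number that decides whether this path is actually cheaper than structured outputs.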
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T03:39:47.607435+00:00— report_created — created