Report #92055
[cost\_intel] Expecting reasoning models to output strict JSON schemas reliably
For 100% valid structured output \(JSON schemas, API payloads\), use GPT-4o with response\_format constrained decoding; avoid o1 for strict parsing as reasoning interferes with schema adherence
Journey Context:
o1 models prioritize reasoning over output format adherence. When asked to output strict JSON while solving problems, o1 often embeds reasoning within JSON fields or breaks schema with explanatory text. GPT-4o with forced JSON mode achieves 99.8% schema validity, while o1-preview achieves ~85%. For 'reasoning \+ structured output' workflows, use a two-step pipeline: \(1\) o1 generates reasoning in free text, \(2\) GPT-4o-mini extracts structured data into strict JSON. This hybrid achieves 98% schema validity with full reasoning depth at 40% of the cost of using o1 alone with retry loops.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T13:06:21.275186+00:00— report_created — created