Report #49970
[cost_intel] Reasoning models failing at strict structured output (JSON mode) despite higher reasoning capability
Avoid o1/o3 for strict JSON schema compliance (especially nested objects with optional fields). Use GPT-4o with constrained decoding or tool calling instead. Reasoning models tend to hallucinate extra fields or over-complicate schemas.
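As a rough illustration of what "constrained decoding" asks of a schema, the sketch below builds a strict `json_schema` response-format payload and lints the schema first. It assumes the strict-mode conventions of OpenAI's Structured Outputs as commonly documented: every object sets `additionalProperties` to false, every property is listed in `required`, and an optional field is written as a union with null. The schema, its field names, and the `strict_problems` helper are hypothetical, and no API call is made.

```python
import json

# Hypothetical schema for illustration; the field names are not from the report.
CUSTOMER_SCHEMA = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "address": {
            "type": "object",
            "properties": {
                "city": {"type": "string"},
                # Optional field: kept in "required", but allowed to be null.
                "unit": {"type": ["string", "null"]},
            },
            "required": ["city", "unit"],
            "additionalProperties": False,
        },
    },
    "required": ["name", "address"],
    "additionalProperties": False,
}


def strict_problems(schema, path="$"):
    """List the places where a schema breaks the assumed strict-mode rules."""
    problems = []
    declared = schema.get("type")
    types = declared if isinstance(declared, list) else [declared]
    if "object" in types:
        props = schema.get("properties", {})
        if schema.get("additionalProperties") is not False:
            problems.append(f"{path}: additionalProperties must be false")
        missing = sorted(set(props) - set(schema.get("required", [])))
        if missing:
            problems.append(f"{path}: not listed in required: {missing}")
        for key, sub in props.items():
            problems.extend(strict_problems(sub, f"{path}.{key}"))
    return problems


def build_response_format(name, schema):
    """Wrap a schema in the response_format shape assumed above."""
    return {
        "type": "json_schema",
        "json_schema": {"name": name, "strict": True, "schema": schema},
    }


if __name__ == "__main__":
    print(strict_problems(CUSTOMER_SCHEMA))
    print(json.dumps(build_response_format("customer", CUSTOMER_SCHEMA), indent=2))
```

Linting before sending the request catches the two schema shapes this report flags as fragile, nested objects and optional fields, before they reach the model.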
Journey Context:
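Reasoning models appear to prioritize 'helpful' reasoning over strict schema adherence. They often add explanatory fields that are not in the schema, or nest objects one level deeper than requested. This 'quality degradation signature' differs from that of instruct models, which tend to omit fields entirely. The likely cause is the training objective: reasoning models are optimized for chain-of-thought correctness, not for token-level grammar constraints. Mitigation: use GPT-4o with schema-constrained output or function calling in strict mode, where a grammar constraint is applied at inference time, or post-process o1 output with the json_repair library. Note that plain JSON mode guarantees only syntactically valid JSON, not conformance to a particular schema, and that json_repair targets malformed JSON syntax, so repaired output should still be validated against the schema.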
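The post-processing route can be sketched with the standard library alone. The example below is a minimal illustration, not a replacement for json_repair or a full JSON Schema validator: it drops fields the schema does not define (the "extra explanatory field" failure described above) and reports required fields that are missing. The `conform` helper, the schema, and the sample output are all hypothetical.

```python
import json


def conform(data, schema, path="$"):
    """Return (cleaned, issues): data with unknown fields removed, plus notes."""
    issues = []
    if schema.get("type") != "object" or not isinstance(data, dict):
        # Scalars and arrays are passed through unchanged in this sketch.
        return data, issues
    props = schema.get("properties", {})
    cleaned = {}
    for key, value in data.items():
        if key not in props:
            issues.append(f"{path}.{key}: dropped field not in schema")
            continue
        cleaned[key], nested = conform(value, props[key], f"{path}.{key}")
        issues.extend(nested)
    for key in schema.get("required", []):
        if key not in cleaned:
            issues.append(f"{path}.{key}: required field missing")
    return cleaned, issues


# Hypothetical schema and model output, for illustration only.
SCHEMA = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "address": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
    "required": ["name", "address"],
}

RAW = (
    '{"name": "Ada", "explanation": "City inferred from context", '
    '"address": {"city": "London", "confidence": 0.9}}'
)

cleaned, issues = conform(json.loads(RAW), SCHEMA)

if __name__ == "__main__":
    print(cleaned)
    for issue in issues:
        print(issue)
```

Logging the returned issues rather than silently discarding them makes it possible to track how often a given model drifts from the schema.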
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T14:21:29.067137+00:00— report_created — created