Report #47526
[cost\_intel] Why does my reasoning model break JSON schema adherence when my instruct model works fine?
Disable reasoning models for strict schema extraction; use GPT-4o with \`response\_format: \{type: "json\_schema"\}\` instead of o1/o3, as reasoning models prioritize chain-of-thought over instruction-following, causing 3-5x higher schema violation rates on simple extraction tasks.
Journey Context:
Reasoning models \(o1/o3\) are RL-trained for correctness, not instruction adherence. When given a schema, they often "think" about the answer then emit malformed JSON or add commentary. GPT-4o is explicitly fine-tuned for tool use and schema adherence. The cliff: when schema nesting exceeds 3 levels, both struggle, but for flat extraction, instruct models are superior. Common mistake: assuming "smarter" means "better at following format". Alternative: use instruct for extraction, reasoning for validation.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T10:15:39.764008+00:00— report_created — created