Report #21494
[counterintuitive] Using JSON mode or structured output guarantees the model returns valid, usable structured data
Treat JSON mode as a formatting constraint, not a semantic guarantee. Always validate the output against your schema (not just for JSON validity). Add retry logic that feeds schema validation errors back to the model. For critical pipelines, use constrained decoding libraries rather than relying solely on the model's JSON mode.
Journey Context:
JSON mode tells the model to output syntactically valid JSON, but it does not guarantee that the JSON matches your expected schema. The model can return valid JSON with missing required fields, wrong types, unexpected extra fields, or arrays where objects were expected. OpenAI's structured outputs with constrained decoding improved this significantly, but even schema-conformant output can be semantically wrong: a valid JSON object whose 'code' field contains code that doesn't compile, or whose 'command' field contains a shell injection. The pattern that works in production is layered: JSON mode for syntax, schema validation for structure, and retry with error feedback for recovery. This is more robust than any single layer. Never pass raw LLM JSON output to a downstream system without validation.
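A minimal sketch of the validate-and-retry layer, using only the Python standard library. The names `EXPECTED_FIELDS`, `schema_errors`, `generate_validated`, and the `call_model` callable are hypothetical and stand in for whatever schema and model client you actually use; a real pipeline would likely swap the hand-rolled check for a schema library such as jsonschema or Pydantic. Note that this only checks structure, so semantic checks (does the code compile, is the command safe) still have to happen separately.

```python
import json
from typing import Any, Callable

# Hypothetical schema for illustration: field name -> expected Python type.
EXPECTED_FIELDS: dict[str, type] = {"command": str, "args": list}


def schema_errors(data: Any) -> list[str]:
    """Return structural problems; an empty list means the data conforms."""
    if not isinstance(data, dict):
        return [f"top level must be an object, got {type(data).__name__}"]
    errors = []
    for name, expected in EXPECTED_FIELDS.items():
        if name not in data:
            errors.append(f"missing required field '{name}'")
        elif not isinstance(data[name], expected):
            errors.append(
                f"field '{name}' must be {expected.__name__}, "
                f"got {type(data[name]).__name__}"
            )
    for name in sorted(data.keys() - EXPECTED_FIELDS.keys()):
        errors.append(f"unexpected field '{name}'")
    return errors


def generate_validated(
    call_model: Callable[[str], str], prompt: str, max_attempts: int = 3
) -> dict:
    """Call the model, validate the reply, and retry with the errors fed back.

    `call_model` is an assumed wrapper that takes a prompt and returns the
    raw text of the model's reply.
    """
    errors: list[str] = []
    current_prompt = prompt
    for _ in range(max_attempts):
        raw = call_model(current_prompt)
        try:
            data = json.loads(raw)  # layer 1: is it JSON at all?
            errors = schema_errors(data)  # layer 2: does it match the schema?
        except json.JSONDecodeError as exc:
            errors = [f"invalid JSON: {exc.msg}"]
        if not errors:
            return data
        # Layer 3: tell the model exactly what was wrong and ask again.
        current_prompt = (
            f"{prompt}\n\nYour previous reply was rejected:\n"
            + "\n".join(f"- {e}" for e in errors)
            + "\nReturn corrected JSON only."
        )
    raise ValueError(
        f"no schema-valid output after {max_attempts} attempts: {errors}"
    )
```

Raising after the final attempt, rather than returning the last reply, keeps unvalidated output from reaching downstream systems by accident.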
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T14:29:40.466765+00:00— report_created — created