Report #64069
[counterintuitive] Why does JSON mode or structured output produce well-formed JSON with completely wrong values?
Validate the semantic content of structured outputs, not just their schema compliance. JSON mode and structured outputs guarantee syntax, not correctness. Always add application-level validation: check that values are in expected ranges, that references exist, that logical constraints hold, that enumerated values are from the allowed set. Never conflate 'parses as valid JSON' with 'contains correct information'.
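A minimal sketch of what such application-level validation might look like, assuming a hypothetical order record. The field names, `ALLOWED_STATUSES`, `KNOWN_CUSTOMER_IDS`, and the numeric bounds are all illustrative and not taken from any real API. The sketch also assumes the required fields are present, which is the part a schema can enforce.

```python
import json
from datetime import date

# Hypothetical application data; names and values are illustrative only.
ALLOWED_STATUSES = {"pending", "shipped", "cancelled"}
KNOWN_CUSTOMER_IDS = {"c-001", "c-002"}


def validate_order(raw: str) -> list[str]:
    """Return the semantic problems found in a model-produced order."""
    # Syntax only: raises json.JSONDecodeError if the text is malformed.
    data = json.loads(raw)
    problems = []

    # Range check: the value must be plausible, not merely a number.
    if not 0 < data["quantity"] <= 1000:
        problems.append("quantity out of range")

    # Reference check: the ID must point at something that exists.
    if data["customer_id"] not in KNOWN_CUSTOMER_IDS:
        problems.append("unknown customer_id")

    # Enumerated value check: the value must come from the allowed set.
    if data["status"] not in ALLOWED_STATUSES:
        problems.append("status not allowed")

    # Logical constraint: an order cannot ship before it was placed.
    if date.fromisoformat(data["ship_date"]) < date.fromisoformat(data["order_date"]):
        problems.append("ship_date precedes order_date")

    return problems
```

Returning a list of problems rather than raising on the first one is a design choice made here so that a caller can log every failure or feed them all back to the model in a retry prompt.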
Journey Context:
A common pattern is to enable JSON mode or structured outputs and then assume the model's outputs are 'reliable' because they parse correctly. This is dangerously wrong. JSON mode constrains the token distribution so that the output is syntactically valid JSON, but it does nothing to constrain the semantic content. A model in JSON mode will happily produce {"answer": 42, "confidence": 0.99} when the answer is 17 and the model has no basis for high confidence. The constraint is purely grammatical. Worse, structured output can create a false sense of reliability: developers skip validation because 'the output is structured.' OpenAI's own structured outputs documentation focuses on guaranteeing that the schema is followed, not that the content is truthful. The mental model: JSON mode is a grammar constraint, not a truth constraint. It is like requiring someone to speak in complete sentences: the grammar can be perfect while the content is entirely wrong.
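The gap between grammar and truth can be sketched with the example payload from the paragraph above. `shape_ok`, `answer_ok`, and `trusted_answer` are hypothetical names introduced only for illustration. The trusted value stands in for whatever independent source an application has, such as a recomputation, a database lookup, or a second check.

```python
import json

# Well-formed output, as in the example above.
raw = '{"answer": 42, "confidence": 0.99}'


def shape_ok(data: dict) -> bool:
    """Schema-style check: right keys and types, nothing about truth."""
    return (
        isinstance(data.get("answer"), int)
        and isinstance(data.get("confidence"), float)
    )


def answer_ok(data: dict, expected: int) -> bool:
    """Semantic check against an independently obtained value."""
    return data["answer"] == expected


data = json.loads(raw)   # parsing succeeds
trusted_answer = 17      # hypothetical value from an independent source

print("parses and matches shape:", shape_ok(data))
print("matches trusted answer:", answer_ok(data, trusted_answer))
```

The first check passes and the second fails. A pipeline that stops after parsing would therefore accept this payload, and the self-reported confidence of 0.99 would not flag the problem.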
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T14:01:35.870296+00:00 - report_created - created