Report #53822
[counterintuitive] Why does JSON mode or structured output still produce semantically wrong results?
Treat JSON mode as a syntax guarantee only, never as a correctness guarantee. Always validate semantic constraints separately: check enum values against allowed sets, validate numeric ranges, verify referential integrity, and confirm that relationships between fields are consistent. Use schema-level constraints (enum, minimum/maximum) where the API supports them.
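A minimal sketch of what that separate validation pass could look like, using only the Python standard library. The "order" payload, its field names, and the allowed sets below are hypothetical and exist only for illustration; they are not taken from any particular API.

```python
import json
from typing import List

# Hypothetical domain rules for an illustrative "order" payload.
ALLOWED_STATUSES = {"pending", "shipped", "delivered"}
KNOWN_CUSTOMER_IDS = {"c-001", "c-002"}


def validate_order(raw: str) -> List[str]:
    """Return a list of semantic errors; an empty list means the payload passed."""
    data = json.loads(raw)  # syntax check only; raises ValueError on malformed JSON
    if not isinstance(data, dict):
        return ["top-level value is not an object"]
    errors = []

    # Enum check: the value must come from the allowed set.
    status = data.get("status")
    if not isinstance(status, str) or status not in ALLOWED_STATUSES:
        errors.append(f"unknown status: {status!r}")

    # Numeric range check (bool is excluded because it is a subclass of int).
    quantity = data.get("quantity")
    if isinstance(quantity, bool) or not isinstance(quantity, int) or not 1 <= quantity <= 100:
        errors.append(f"quantity out of range: {quantity!r}")

    # Referential integrity: the reference must point at a known record.
    customer_id = data.get("customer_id")
    if not isinstance(customer_id, str) or customer_id not in KNOWN_CUSTOMER_IDS:
        errors.append(f"unknown customer_id: {customer_id!r}")

    # Cross-field consistency: a delivered order needs a delivery date.
    if status == "delivered" and not data.get("delivered_at"):
        errors.append("status is 'delivered' but delivered_at is missing")

    return errors
```

Returning a list of errors rather than raising on the first one makes it easier to feed all violations back into a retry prompt or a log entry in a single pass.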
Journey Context:
Developers enable JSON mode and assume the output is 'correct' because it parses. JSON mode constrains token sampling to produce syntactically valid JSON; it does not constrain the values. The model can produce perfectly parseable JSON with hallucinated enum values, impossible numeric ranges, contradictory field relationships, or fabricated references. This is not a formatting problem (which JSON mode solves) but a generation fidelity problem (which it doesn't). The model generates each field based on statistical patterns, not by checking against a semantic specification. Structured output with schema constraints partially addresses this but still cannot enforce cross-field consistency or business logic.
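The gap between the three layers (syntax, per-field schema constraints, cross-field logic) can be illustrated with a small sketch. The schema fragment uses the standard JSON Schema keywords `enum`, `minimum`, and `maximum`, but the field names and the hand-rolled `per_field_violations` helper are hypothetical stand-ins; a real schema validator covers far more, and which keywords a given structured-output API honors is an assumption to verify.

```python
import json

# Hypothetical schema fragment; only enum/minimum/maximum are used here.
SCHEMA = {
    "properties": {
        "priority": {"enum": ["low", "medium", "high"]},
        "discount_percent": {"minimum": 0, "maximum": 100},
        "start_day": {"minimum": 1, "maximum": 31},
        "end_day": {"minimum": 1, "maximum": 31},
    }
}


def per_field_violations(data, schema):
    """Tiny illustrative check that looks at one field at a time."""
    problems = []
    for name, rules in schema["properties"].items():
        if name not in data:
            continue
        value = data[name]
        if "enum" in rules and value not in rules["enum"]:
            problems.append(f"{name}: not an allowed value")
        if "minimum" in rules or "maximum" in rules:
            low = rules.get("minimum", float("-inf"))
            high = rules.get("maximum", float("inf"))
            is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
            if not is_number or not low <= value <= high:
                problems.append(f"{name}: outside [{low}, {high}]")
    return problems


# Parses fine, yet the enum value is made up and the discount is impossible.
hallucinated = json.loads('{"priority": "urgent-ish", "discount_percent": 140}')

# Every field is individually valid, but the range ends before it starts.
contradictory = json.loads('{"priority": "high", "start_day": 20, "end_day": 5}')

print(per_field_violations(hallucinated, SCHEMA))   # per-field rules catch both problems
print(per_field_violations(contradictory, SCHEMA))  # nothing reported: each field looks fine alone
print(contradictory["start_day"] <= contradictory["end_day"])  # the cross-field rule, checked in code
```

Both payloads pass `json.loads`, which is all that syntactic validity promises. Per-field constraints catch the first payload but not the second, so the relationship between `start_day` and `end_day` still has to be checked in application code.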
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T20:50:04.250173+00:00— report_created — created