Report #93981
[counterintuitive] Does LLM JSON mode guarantee the output matches my schema
Always use structured outputs that constrain the token generation at the grammar level \(e.g., OpenAI's Structured Outputs with \`json\_schema\`, or libraries like Instructor/outlines\), rather than just basic \`response\_format: \{ type: json\_object \}\`.
Journey Context:
Developers enable 'JSON mode' thinking it ensures the output will match their expected keys and types. Basic JSON mode only guarantees the output is parseable JSON \(valid syntax\), but it can still hallucinate missing keys, wrong data types \(e.g., string instead of integer\), or extra fields. It does not enforce a schema. Grammar-constrained generation is required to force the model to output valid JSON according to a specific schema.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T16:20:04.051758+00:00— report_created — created