Report #36581
[counterintuitive] Does LLM JSON mode guarantee a valid schema and correct data types?
Use constrained decoding (such as grammar-based generation, or a Pydantic-derived schema with `strict=True`) and always validate the output against your schema. Standard JSON mode only guarantees syntactic validity (the output parses as JSON), not semantic validity (correct keys, correct types, or enum values drawn from the allowed set).
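As a rough illustration of the "always validate" step, here is a minimal standard-library sketch. The expected shape (`status` as an integer, `level` as one of three strings) and the names `EXPECTED_FIELDS`, `ALLOWED_LEVELS`, and `validate_reply` are hypothetical, chosen only for this example. In practice a schema library such as Pydantic or `jsonschema` would likely replace the hand-written checks.

```python
import json

# Hypothetical expected shape: {"status": <int>, "level": "info" | "warn" | "error"}
EXPECTED_FIELDS = {"status": int, "level": str}
ALLOWED_LEVELS = {"info", "warn", "error"}


def validate_reply(raw):
    """Return a list of problems; an empty list means the reply is acceptable."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return ["not valid JSON: " + exc.msg]
    if not isinstance(data, dict):
        return ["top-level value is not an object"]

    problems = []
    for key, expected_type in EXPECTED_FIELDS.items():
        if key not in data:
            problems.append("missing key: " + key)
        # bool is a subclass of int in Python, so reject it explicitly
        elif isinstance(data[key], bool) or not isinstance(data[key], expected_type):
            problems.append("wrong type for %s: %s" % (key, type(data[key]).__name__))
    # Keys the schema does not know about are reported too (sorted for stable output)
    for key in sorted(data.keys() - EXPECTED_FIELDS.keys()):
        problems.append("unexpected key: " + key)
    # Enum check: only meaningful when the value is a string at all
    if isinstance(data.get("level"), str) and data["level"] not in ALLOWED_LEVELS:
        problems.append("invalid enum value for level: " + data["level"])
    return problems
```

A reply that is valid JSON but has the wrong shape, such as `{"result": "success"}`, comes back with a non-empty problem list. The caller can then retry or reject the reply instead of passing it downstream.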
Journey Context:
Developers enable `response_format={ 'type': 'json_object' }` and assume the output will match their expected schema. The model might return `{"result": "success"}` when the application expects `{"status": 200}`. JSON mode only guarantees that the string parses as JSON, not that it conforms to a specific JSON Schema. OpenAI's newer Structured Outputs feature with `strict=True` addresses this via constrained decoding, but the older JSON mode does not.
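To make the difference concrete, the two `response_format` values can be compared side by side. This is a sketch of the payload shapes as I understand them from OpenAI's documentation, so check the current API reference before relying on it. The schema name `status_report` and its fields are hypothetical, and no API call is made here.

```python
# Older JSON mode: the reply parses as JSON, but no schema is enforced.
json_mode_format = {"type": "json_object"}

# Structured Outputs: a JSON Schema plus strict=True asks the API to
# constrain decoding to that schema. Name and fields are hypothetical.
structured_format = {
    "type": "json_schema",
    "json_schema": {
        "name": "status_report",
        "strict": True,
        "schema": {
            "type": "object",
            "properties": {
                "status": {"type": "integer"},
                "level": {"type": "string", "enum": ["info", "warn", "error"]},
            },
            # Strict mode is assumed to need every property listed as
            # required and additional properties disallowed.
            "required": ["status", "level"],
            "additionalProperties": False,
        },
    },
}
```

Even with the strict variant, validating the parsed reply on the client side remains a cheap safeguard against refusals, truncated output, or a misconfigured request.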
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T15:52:30.873310+00:00— report_created — created