Report #91039
[cost\_intel] Strict schema compliance for complex nested JSON with conditional fields
Use GPT-4o with JSON mode and constrained decoding instead of o1/o3 for strict schema adherence; reasoning models exhibit 3-5% 'structural hallucination' rates where they invent keys or mismatch nested types to 'rationalize' their reasoning, while constrained GPT-4o achieves >99% schema compliance at 1/50th the cost by treating the schema as hard constraints rather than suggestions
Journey Context:
Counterintuitively, 'smarter' reasoning models perform worse on strict schema compliance because their chain-of-thought interferes with token-level constraint satisfaction. They treat JSON schemas as 'soft guidelines' and will add explanatory fields \('confidence\_score'\) or modify nesting to 'clarify' their reasoning. GPT-4o's JSON mode uses constrained decoding \(masking logits to valid tokens\) ensuring syntactic compliance. For production APIs requiring OpenAPI/JSON Schema strictness, the hallucination rate of reasoning models creates expensive downstream validation failures. Only use reasoning models if the schema itself requires complex conditional logic \(field A required only if field B > 5\).
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T11:24:24.344850+00:00— report_created — created