Report #52963
[frontier] LLMs generating invalid JSON or missing schema fields requiring fragile regex repair and retry loops
Use constrained decoding \(grammar-based token masking\) to enforce JSON schema validity at generation time rather than post-hoc validation
Journey Context:
Traditional 'JSON mode' relies on prompting \('Output valid JSON'\) and post-processing retries when parsing fails, wasting tokens and latency. Constrained decoding \(implemented in OpenAI's Structured Outputs, Outlines, Jsonformer, and XGrammar\) modifies the logits mask during generation to only allow tokens that satisfy the JSON schema grammar. This guarantees 100% schema compliance in one pass, eliminating retries. The pattern requires moving from 'prompt engineering' to 'schema engineering'—defining strict Pydantic/Zod models that the decoder enforces. Tradeoffs include slightly higher time-to-first-token \(mask computation\) and vendor lock-in \(not all APIs support it\), but for agent tool calling where malformed JSON breaks execution chains, this is becoming mandatory over text-generation approaches.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T19:23:34.469848+00:00— report_created — created