Report #56362
[counterintuitive] Why prompt-only formatting instructions don't prevent invalid JSON output
Use constrained decoding (grammar-based sampling) for any structured output requirement. With the OpenAI API, use Structured Outputs with a JSON schema. With open-source models, use libraries such as Outlines or lm-format-enforcer. Never rely on prompt-only formatting in production pipelines.
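As a rough sketch of the OpenAI recommendation above, the snippet below only builds the `response_format` value that a chat completions request would carry. The dictionary shape (`type: "json_schema"`, with `name`, `strict`, and `schema` nested under `json_schema`) reflects the Structured Outputs documentation as best understood at the time of writing and should be checked against the current API reference. The helper name `build_response_format`, the schema name `ticket_triage`, and its fields are hypothetical and not part of the original report.

```python
import json


def build_response_format(name, properties):
    """Build a strict JSON-schema response format (hypothetical helper).

    Assumption: strict mode expects every property to be listed in
    `required` and `additionalProperties` to be false.
    """
    schema = {
        "type": "object",
        "properties": properties,
        "required": list(properties),
        "additionalProperties": False,
    }
    return {
        "type": "json_schema",
        "json_schema": {"name": name, "strict": True, "schema": schema},
    }


# Illustrative schema only; the field names are invented for this example.
response_format = build_response_format(
    "ticket_triage",
    {
        "category": {"type": "string", "enum": ["bug", "feature", "question"]},
        "urgent": {"type": "boolean"},
    },
)

# The value would be passed as the `response_format` argument of the request.
print(json.dumps(response_format, indent=2))
```

With this in place the schema is enforced during decoding on the provider side, so the prompt no longer has to carry the formatting rules.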
Journey Context:
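The widespread belief is that detailed format instructions ('always output valid JSON', 'do not include markdown', 'strictly follow this schema') are sufficient for reliable structured output. In reality, an autoregressive language model emits one token at a time from a probability distribution, and a prompt only shifts that distribution; it never drives the probability of a structure-breaking token to zero. Temperature 0 does not fix this: greedy decoding picks the most likely token, and at positions where several continuations are nearly equally likely, the top-ranked one can be a token that breaks the schema (a stray code fence, an unescaped quote, a premature closing brace). The probability of a format violation increases with output length and schema complexity, because every generated token is another chance to fail. As an illustration, if each token independently had a 0.1% chance of breaking the structure, a 1,000-token output would contain at least one violation about 63% of the time. This is not a reasoning failure; it is a statistical property of token-by-token generation. The model does not 'decide' to output invalid JSON; it emits a token that happens to break the structure. Constrained decoding solves this by masking the logits at each step so that only tokens that keep the output structurally valid can be selected. This is a fundamentally different approach: instead of asking the model to want the right format, you make the wrong format impossible to generate.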
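The logit-masking idea can be sketched with a toy example. Everything here is invented for illustration: the eight-token vocabulary, the `fake_logits` stand-in for a model, and the hand-written state machine for the single shape `{"ok": <bool>}`. Libraries such as Outlines work over the real tokenizer vocabulary and derive the automaton from a schema or grammar, so this is a minimal sketch of the principle, not their implementation.

```python
import json
import math

# Toy vocabulary; a real tokenizer has tens of thousands of entries.
VOCAB = ["{", "}", '"ok"', ":", "true", "false", "```", "Sure!"]

# Hypothetical grammar for the shape {"ok": <bool>}, written as a state
# machine: state -> {allowed token: next state}.
GRAMMAR = {
    "start": {"{": "key"},
    "key": {'"ok"': "colon"},
    "colon": {":": "value"},
    "value": {"true": "close", "false": "close"},
    "close": {"}": "done"},
}


def fake_logits(step):
    """Stand-in for a model forward pass: one invented logit per VOCAB entry.

    At step 0 a markdown fence narrowly outranks the opening brace, mimicking
    a model that sometimes wraps its JSON in a code block.
    """
    preferred = ["```", '"ok"', ":", "true", "}"]
    runner_up = ["{", '"ok"', ":", "false", "}"]
    logits = [0.0] * len(VOCAB)
    logits[VOCAB.index(runner_up[step])] = 2.0
    logits[VOCAB.index(preferred[step])] = 2.1
    return logits


def mask_logits(logits, allowed):
    """Set every disallowed token's logit to -inf so it can never be chosen."""
    return [x if VOCAB[i] in allowed else -math.inf for i, x in enumerate(logits)]


def decode(constrained, max_steps=5):
    """Greedy (temperature 0) decoding, with or without the grammar mask."""
    state, out = "start", []
    for step in range(max_steps):
        logits = fake_logits(step)
        if constrained:
            logits = mask_logits(logits, GRAMMAR[state])
        token = VOCAB[max(range(len(VOCAB)), key=logits.__getitem__)]
        out.append(token)
        if constrained:
            state = GRAMMAR[state][token]
            if state == "done":
                break
    return "".join(out)


print(decode(constrained=False))  # ```"ok":true}  (not valid JSON)
print(decode(constrained=True))   # {"ok":true}
print(json.loads(decode(constrained=True)))  # {'ok': True}
```

Both runs use greedy decoding, so the unconstrained failure comes from token ranking alone, not from random sampling. The mask leaves the model's preferences among valid tokens untouched (it still picks `true` over `false`); it only removes the continuations that would leave the grammar.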
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T01:05:42.116369+00:00— report_created — created