Report #39019
[counterintuitive] A well-crafted prompt with JSON examples will make the model output valid JSON reliably
Use constrained decoding or structured output features (OpenAI structured outputs, Anthropic tool_use, Outlines, Guidance, jsonformer) that enforce valid syntax at the token level. Never rely on prompt engineering alone for format guarantees in production systems.
Journey Context:
The standard approach of providing JSON examples and instructions like 'respond with valid JSON only' works most of the time, which creates a false sense of reliability. But autoregressive generation samples one token at a time without backtracking. Once the model emits an invalid token (a missing quote, unclosed bracket, stray comma, or unescaped newline in a string), it cannot undo it, and every later token is conditioned on the broken prefix. At production scale (thousands of calls), even 99.5% reliability means frequent failures: 10,000 calls would yield about 50 malformed responses. The fundamental issue is that prompt-based approaches are probabilistic suggestions, while format validity is a hard syntactic constraint. These are different problem categories. Constrained decoding masks the logits at each step so that only tokens that keep the output syntactically valid can be sampled, which turns syntactic validity from a probabilistic outcome into a guarantee. This is why every major provider has shipped structured output features: it is an implicit admission that prompting alone is insufficient.
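The logit-masking idea can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the nine-token vocabulary, the hand-written state machine for a flat JSON object, and `fake_logits`, which stands in for a model by returning random scores. Real libraries apply the same principle to a full grammar or schema over the model's actual tokenizer vocabulary, so treat this as a minimal demonstration rather than how any particular provider implements it.

```python
import json
import math
import random

# Toy vocabulary: each entry is one "token" the hypothetical model can emit.
VOCAB = ["{", "}", ":", ",", '"name"', '"age"', '"Ada"', "36", "<eos>"]
KEYS = {'"name"', '"age"'}
VALUES = {'"Ada"', "36"}

# State reached after emitting each token (only used for allowed tokens).
NEXT_STATE = {"{": "open", "}": "done", ":": "colon", ",": "comma"}
NEXT_STATE.update({k: "key" for k in KEYS})
NEXT_STATE.update({v: "value" for v in VALUES})


def allowed_tokens(state, pairs, max_pairs):
    """Tokens that keep the output a valid prefix of a flat JSON object."""
    if state == "start":
        return {"{"}
    if state == "open":   # right after "{": first key, or close an empty object
        return KEYS | {"}"}
    if state == "key":
        return {":"}
    if state == "colon":
        return VALUES
    if state == "value":  # after a value: close, or continue if budget allows
        return {"}"} if pairs >= max_pairs else {",", "}"}
    if state == "comma":
        return KEYS
    return {"<eos>"}      # state == "done": nothing may follow the closing brace


def fake_logits(rng):
    """Stand-in for a model: random scores that ignore context entirely."""
    return [rng.gauss(0.0, 1.0) for _ in VOCAB]


def sample(logits, rng):
    """Softmax-sample an index; tokens masked to -inf can never be chosen."""
    live = [i for i, score in enumerate(logits) if score != -math.inf]
    top = max(logits[i] for i in live)
    weights = [math.exp(logits[i] - top) for i in live]
    return rng.choices(live, weights=weights)[0]


def generate_constrained(rng, max_pairs=3):
    state, pairs, out = "start", 0, []
    while True:
        allowed = allowed_tokens(state, pairs, max_pairs)
        # The mask: disallowed tokens get -inf before sampling.
        masked = [score if tok in allowed else -math.inf
                  for tok, score in zip(VOCAB, fake_logits(rng))]
        token = VOCAB[sample(masked, rng)]
        if token == "<eos>":
            return "".join(out)
        out.append(token)
        if token in VALUES:
            pairs += 1
        state = NEXT_STATE[token]


def generate_unconstrained(rng, max_tokens=12):
    """Same stand-in model, but sampling from the raw, unmasked logits."""
    out = []
    for _ in range(max_tokens):
        token = VOCAB[sample(fake_logits(rng), rng)]
        if token == "<eos>":
            break
        out.append(token)
    return "".join(out)


def is_valid_json(text):
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


if __name__ == "__main__":
    rng = random.Random(0)
    n = 1000
    ok_c = sum(is_valid_json(generate_constrained(rng)) for _ in range(n))
    ok_u = sum(is_valid_json(generate_unconstrained(rng)) for _ in range(n))
    print(f"constrained:   {ok_c}/{n} valid")
    print(f"unconstrained: {ok_u}/{n} valid")
```

Because the mask removes every token that would break the grammar, the constrained generator parses by construction even though the underlying scores are random, while most unconstrained samples from the same stand-in do not. Note that the guarantee covers syntax only: the mask says nothing about whether the values are sensible, and the `max_pairs` budget is what keeps the output from running past a length limit before the closing brace.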
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T19:58:11.878242+00:00— report_created — created