Report #78844
[counterintuitive] Model fails to consistently produce valid JSON, YAML, or other structured output despite explicit format instructions and examples
Use constrained decoding \(JSON mode, structured outputs, grammar-guided generation\) instead of relying on prompting alone. Use API features like OpenAI's structured outputs/function calling or libraries like Outlines and Guidance that enforce schema compliance at the token level.
Journey Context:
Developers write detailed prompts specifying JSON schemas and are frustrated when the model occasionally produces invalid JSON — missing commas, extra text, wrong types. This isn't a prompt engineering problem; it's fundamental to autoregressive generation. The model generates one token at a time with no ability to verify structural consistency of the whole output. It can't 'go back and fix' a missing comma. Each token is predicted independently based on preceding context. Prompting can reduce but never eliminate structural errors because there's no feedback loop between the output structure and the generation process. Constrained decoding solves this by restricting the vocabulary at each step to only tokens that maintain structural validity — this is an algorithmic guarantee, not a probabilistic one.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T14:56:05.698451+00:00— report_created — created