Report #84332
[counterintuitive] Why does the model break JSON or structured format in long outputs despite explicit format instructions?
Use structured output features with constrained decoding (JSON mode, function calling schemas, grammar-constrained generation) rather than relying on prompt instructions alone for format compliance. For very long structured outputs, generate in validated chunks with schema checks between chunks.
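The chunked approach recommended above could be sketched roughly as follows. This is a minimal illustration, not a definitive implementation: `generate_validated`, `validate_record`, and `fake_model` are hypothetical names, `fake_model` stands in for a real model call, and the "schema" is only a required-keys-and-types check using the standard library.

```python
import json


def validate_record(obj, required):
    """Minimal schema check: obj is a dict holding every required key with the expected type."""
    return isinstance(obj, dict) and all(
        key in obj and isinstance(obj[key], typ) for key, typ in required.items()
    )


def generate_validated(generate_chunk, n_chunks, required, max_retries=2):
    """Request one small JSON object per chunk and validate it before moving on."""
    records = []
    for index in range(n_chunks):
        for attempt in range(max_retries + 1):
            raw = generate_chunk(index, attempt)
            try:
                obj = json.loads(raw)
            except json.JSONDecodeError:
                continue  # malformed chunk: ask again
            if validate_record(obj, required):
                records.append(obj)
                break
        else:
            # No attempt passed validation; fail loudly instead of emitting bad data.
            raise ValueError(f"chunk {index} failed validation after {max_retries + 1} attempts")
    return records


def fake_model(index, attempt):
    """Hypothetical stand-in for a model call; the first try at chunk 1 is cut off mid-string."""
    if index == 1 and attempt == 0:
        return '{"id": 1, "text": "cut off'
    return json.dumps({"id": index, "text": f"item {index}"})


records = generate_validated(fake_model, 3, {"id": int, "text": str})
print(records)
```

Because each chunk is short and checked on its own, a single malformed chunk costs one retry rather than invalidating the whole output.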
Journey Context:
The widespread assumption is that if format instructions are given and the first part of the output is correct, the model will hold that format to the end. In practice, format compliance tends to degrade with output length: each token is predicted from the context so far, and the model does not maintain an explicit 'format state machine.' As the output grows, the format instructions become a shrinking fraction of the context, and the model increasingly attends to content patterns over format rules. Adding 'ALWAYS output valid JSON' or 'maintain this format strictly' does not reliably help, because the model cannot enforce constraints on its own output; it can only predict likely next tokens. Nothing in plain sampling looks ahead to verify that closing brackets will balance. Constrained decoding (logit masking) is the architectural fix: at each generation step it restricts the vocabulary to format-valid tokens, so the output cannot violate the grammar (an output cut off by a token limit can still be incomplete). This is why structured output APIs exist: not as a convenience, but as a necessity for reliable format compliance.
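To make the logit-masking idea concrete, here is a toy sketch over a five-token vocabulary whose grammar is nested lists of `1`. Everything in it is an assumption for illustration: `VOCAB`, `allowed_tokens`, `decode`, and `forgetful_scores` are hypothetical names, and `forgetful_scores` is a scripted stand-in for a model that writes good content but wants to stop before closing its bracket. Real structured output APIs work on full tokenizers and full grammars.

```python
import json
import math

VOCAB = ["[", "]", "1", ",", "<eos>"]


def allowed_tokens(prefix):
    """Tokens that keep `prefix` a valid prefix of a nested list of 1s."""
    if not prefix:
        return {"["}
    depth = prefix.count("[") - prefix.count("]")
    last = prefix[-1]
    if last == "[":
        return {"[", "1", "]"}
    if last == ",":
        return {"[", "1"}
    # last is "1" or "]": a value has just ended
    if depth == 0:
        return {"<eos>"}
    return {",", "]"}


def decode(score_fn, constrained, max_steps=32):
    """Greedy decoding; when constrained, invalid tokens are masked to -inf first."""
    prefix = []
    for _ in range(max_steps):
        scores = dict(score_fn(prefix))
        if constrained:
            ok = allowed_tokens(prefix)
            for tok in scores:
                if tok not in ok:
                    scores[tok] = -math.inf
        tok = max(VOCAB, key=lambda t: scores[t])
        if tok == "<eos>":
            break
        prefix.append(tok)
    return "".join(prefix)


def forgetful_scores(prefix):
    """Toy 'model': follows a script, then prefers to stop without closing the list."""
    script = ["[", "1", ",", "1", ",", "1"]
    scores = {tok: 0.0 for tok in VOCAB}
    if len(prefix) < len(script):
        scores[script[len(prefix)]] = 5.0
    else:
        scores["<eos>"] = 5.0  # top choice: stop now
        scores["]"] = 1.0      # distant second choice: close the bracket
    return scores


free = decode(forgetful_scores, constrained=False)
masked = decode(forgetful_scores, constrained=True)
print(repr(free), repr(masked))
```

The same scores produce an unbalanced string when sampled freely and a parseable one when masked. The mask changes only which tokens are eligible, not the model's preferences. If `max_steps` runs out first, the masked output is still a valid prefix but may be incomplete.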
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T00:08:41.376927+00:00— report_created — created