Report #94117
[counterintuitive] I can get perfectly reliable JSON/XML/structured output by describing the format carefully enough in the system prompt
Use structured outputs with constrained decoding (JSON mode, grammar-based generation, response_format with JSON Schema) for any output that must be valid structured data. Prompt-only format instructions reduce the risk of malformed output but never eliminate it. Always validate output programmatically and retry on failure.
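A minimal sketch of the validate-and-retry step, using only the Python standard library. The `generate` callable is a hypothetical stand-in for whatever client call returns the model's raw text, and `required_keys` is an illustrative, deliberately simple substitute for full schema validation.

```python
import json
from typing import Callable


def parse_with_retry(
    generate: Callable[[str], str],  # hypothetical: takes a prompt, returns raw model text
    prompt: str,
    required_keys: set[str],
    max_attempts: int = 3,
) -> dict:
    """Call `generate` until it returns a JSON object containing `required_keys`."""
    last_error: Exception | None = None
    for _ in range(max_attempts):
        raw = generate(prompt)
        try:
            data = json.loads(raw)
        except json.JSONDecodeError as exc:
            # Malformed output (markdown fences, trailing commas, ...): try again.
            last_error = exc
            continue
        if not isinstance(data, dict):
            last_error = ValueError("top-level JSON value is not an object")
            continue
        missing = required_keys - set(data)
        if missing:
            last_error = ValueError(f"missing keys: {sorted(missing)}")
            continue
        return data
    raise RuntimeError(f"no valid output after {max_attempts} attempts") from last_error
```

The same check is still worth keeping when constrained decoding is enabled, since it is cheap and guards against failures outside the decoder.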
Journey Context:
Developers write increasingly elaborate format instructions ('respond ONLY in valid JSON, no markdown fences, no trailing commas, no comments...') and still get occasional malformed output. This is because unconstrained token generation is probabilistic: at every step, there is a non-zero probability of generating a token that breaks the format. More detailed prompts reduce that probability but never bring it to zero. The fundamental issue is that standard token sampling has no structural guarantees: it cannot enforce that a generated '{' is eventually matched by a '}'. Constrained decoding (logit masking, grammar-based generation) prevents invalid tokens from being sampled at each step, which is an architectural intervention, not a prompting one. OpenAI built the structured outputs feature precisely because prompting alone was insufficient for production reliability; this is documented explicitly in their design rationale.
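The logit-masking idea can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the four-token vocabulary, the random logits standing in for a real model, and the function names. The "grammar" is just balanced braces. It is not how any particular provider implements constrained decoding.

```python
import math
import random

VOCAB = ["{", "}", "a", "<eos>"]  # toy vocabulary, illustration only


def allowed_tokens(depth: int, steps_left: int) -> set[str]:
    """Tokens that keep the output completable as a balanced-brace string."""
    allowed = set()
    if depth == 0:
        allowed.add("<eos>")  # may only stop when every brace is closed
    else:
        allowed.add("}")      # may only close a brace that is open
    if steps_left >= depth + 2:
        allowed.add("{")      # room left to close this new brace later
    if steps_left >= depth + 1:
        allowed.add("a")      # filler must not use up steps needed for closing
    return allowed


def sample_constrained(rng: random.Random, max_tokens: int = 12) -> str:
    out, depth = [], 0
    for step in range(max_tokens):
        logits = [rng.gauss(0.0, 1.0) for _ in VOCAB]  # stand-in for model logits
        allowed = allowed_tokens(depth, max_tokens - step)
        # Logit masking: disallowed tokens get -inf, so their probability is exactly 0.
        masked = [l if t in allowed else float("-inf") for t, l in zip(VOCAB, logits)]
        top = max(masked)
        weights = [math.exp(l - top) for l in masked]
        token = rng.choices(VOCAB, weights=weights)[0]
        if token == "<eos>":
            break
        out.append(token)
        depth += {"{": 1, "}": -1}.get(token, 0)
    return "".join(out)


def is_balanced(text: str) -> bool:
    depth = 0
    for ch in text:
        depth += {"{": 1, "}": -1}.get(ch, 0)
        if depth < 0:
            return False
    return depth == 0
```

Because the mask is applied before sampling, an unbalanced string cannot be produced whatever the logits are. A prompt can only shift the logits, which is why it lowers the failure rate without removing it.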
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T16:33:49.775194+00:00— report_created — created