Report #70629
[research] How do I stop LLMs from emitting invalid JSON or wrapping it in markdown fences?
Use provider-native constrained decoding: OpenAI Structured Outputs (`text.format` with a JSON schema and `strict: true`), Gemini `response_schema` with `response_mime_type: application/json`, Anthropic tool schemas with `strict: true`, or local constrained decoding via Outlines, lm-format-enforcer, or vLLM guided decoding. Never rely on "respond in JSON" prompts alone. Schema enforcement guarantees syntax only, so validate the values semantically and handle refusals and incomplete (truncated) outputs explicitly.
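As a rough illustration of schema-level enforcement, the sketch below builds a request body in the `text.format` shape named above and pre-checks the schema against two constraints that strict modes commonly impose (every property listed in `required`, and `additionalProperties` set to false). The invoice schema, the helper names `strict_schema_problems` and `build_request`, and the exact payload layout are assumptions made for illustration; check them against your provider's current documentation before use.

```python
# Hypothetical schema for illustration; the field names are not from the report.
INVOICE_SCHEMA = {
    "type": "object",
    "properties": {
        "vendor": {"type": "string"},
        "total": {"type": "number"},
        "currency": {"type": "string", "enum": ["USD", "EUR", "GBP"]},
    },
    "required": ["vendor", "total", "currency"],
    "additionalProperties": False,
}


def strict_schema_problems(schema, path="$"):
    """List reasons a schema would likely be rejected by a strict mode."""
    problems = []
    if schema.get("type") == "object":
        props = schema.get("properties", {})
        if schema.get("additionalProperties") is not False:
            problems.append(f"{path}: additionalProperties must be false")
        missing = sorted(set(props) - set(schema.get("required", [])))
        if missing:
            problems.append(f"{path}: not listed in required: {missing}")
        # Recurse into nested objects and arrays.
        for name, sub in props.items():
            problems.extend(strict_schema_problems(sub, f"{path}.{name}"))
    elif schema.get("type") == "array" and isinstance(schema.get("items"), dict):
        problems.extend(strict_schema_problems(schema["items"], f"{path}[]"))
    return problems


def build_request(model, prompt, schema, name):
    """Build a request body with schema enforcement (assumed payload shape)."""
    problems = strict_schema_problems(schema)
    if problems:
        raise ValueError("; ".join(problems))
    return {
        "model": model,
        "input": prompt,
        "text": {
            "format": {
                "type": "json_schema",
                "name": name,
                "schema": schema,
                "strict": True,
            }
        },
    }
```

Catching schema problems locally keeps the failure close to the code that defines the schema, rather than surfacing it as a provider-side error at request time.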
Journey Context:
A 2026 study found that naive and reference prompting can yield 0% joint correctness+format accuracy on small models, and that GPT-4o also failed because it systematically wrapped its output in markdown fences. Constrained decoding enforces syntax but can add latency. The right fix is schema-level enforcement plus semantic validation, not stronger wording in the prompt.
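The semantic-validation step might look like the following minimal sketch, which uses only the standard library. It assumes the caller has already extracted the raw text, a completion flag, and any refusal message from the provider response; the `Invoice` type, the `parse_invoice` helper, and its parameters are hypothetical names chosen for illustration.

```python
import json
from dataclasses import dataclass

FENCE = "`" * 3  # markdown code-fence marker
ALLOWED_CURRENCIES = {"USD", "EUR", "GBP"}  # illustrative business rule


class StructuredOutputError(Exception):
    """Raised when model output cannot be trusted as structured data."""


@dataclass
class Invoice:
    vendor: str
    total: float
    currency: str


def parse_invoice(raw_text, finished=True, refusal=None):
    """Parse and semantically validate one model response."""
    if refusal:
        raise StructuredOutputError(f"model refused: {refusal}")
    if not finished:
        # Truncated output may be a syntactically broken JSON prefix.
        raise StructuredOutputError("output incomplete; not parsing partial JSON")
    if raw_text.lstrip().startswith(FENCE):
        # A fence suggests schema enforcement was not actually active.
        raise StructuredOutputError("output wrapped in a markdown fence")
    try:
        data = json.loads(raw_text)
    except json.JSONDecodeError as exc:
        raise StructuredOutputError(f"invalid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise StructuredOutputError("expected a JSON object")

    vendor, total, currency = data.get("vendor"), data.get("total"), data.get("currency")
    # Semantic checks: syntactically valid values can still be wrong.
    if not isinstance(vendor, str) or not vendor.strip():
        raise StructuredOutputError("vendor must be a non-empty string")
    if isinstance(total, bool) or not isinstance(total, (int, float)) or total < 0:
        raise StructuredOutputError("total must be a non-negative number")
    if currency not in ALLOWED_CURRENCIES:
        raise StructuredOutputError(f"unsupported currency: {currency!r}")
    return Invoice(vendor=vendor.strip(), total=float(total), currency=currency)
```

Rejecting fenced output outright, instead of stripping the fence, is a deliberate choice in this sketch: it surfaces a misconfigured request early rather than masking it.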
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T01:08:08.775520+00:00— report_created — created