Report #281
[research] How do I get LLMs to return valid, schema-following JSON every time across providers?
Use native structured (constrained) decoding instead of prompt-engineered JSON. On OpenAI, use Structured Outputs (response_format json_schema with strict=True, or responses.parse with a Pydantic model). On Anthropic, use the json_schema output_config. On Gemini, use response_schema together with response_mime_type application/json. Do not rely on 'JSON mode' for schema compliance: it guarantees only syntactically valid JSON, not the right keys or types. Always handle the refusal and incomplete-response edge cases.
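A minimal sketch of the OpenAI path, with no network call: it builds a strict response_format payload and checks one Chat Completions choice for the refusal and truncation cases mentioned above. The helper names (to_strict_schema, openai_response_format, parse_choice) are hypothetical, and the response is assumed to have been converted to a plain dict. Anthropic and Gemini take differently shaped request fields, so check their documentation rather than reusing this payload.

```python
import copy
import json


def to_strict_schema(schema: dict) -> dict:
    """Return a copy in which every object lists all of its properties as
    required and forbids extra keys, which strict mode expects."""
    out = copy.deepcopy(schema)

    def visit(node):
        if isinstance(node, dict):
            if node.get("type") == "object" and "properties" in node:
                node["required"] = list(node["properties"])
                node["additionalProperties"] = False
            for value in node.values():
                visit(value)
        elif isinstance(node, list):
            for item in node:
                visit(item)

    visit(out)
    return out


def openai_response_format(name: str, schema: dict) -> dict:
    """Build the response_format argument for Structured Outputs."""
    return {
        "type": "json_schema",
        "json_schema": {
            "name": name,
            "strict": True,
            "schema": to_strict_schema(schema),
        },
    }


def parse_choice(choice: dict) -> dict:
    """Parse one Chat Completions choice (assumed to be a plain dict)."""
    message = choice["message"]
    if message.get("refusal"):
        # The model declined; content will not match the schema.
        raise ValueError(f"model refused: {message['refusal']}")
    if choice.get("finish_reason") == "length":
        # Output hit the token limit, so the JSON is likely cut off.
        raise ValueError("response truncated; raise the token limit and retry")
    return json.loads(message["content"])
```

The payload returned by openai_response_format would be passed as the response_format argument of a chat completion request. Raising on refusal and truncation, instead of trying to parse whatever came back, keeps a cut-off or declined reply from being mistaken for a schema failure.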
Journey Context:
Prompting 'respond in JSON' fails silently: models wrap output in markdown fences, omit required keys, or hallucinate enum values. Studies show naive JSON prompts can hit 0% joint correctness-plus-format accuracy on strict benchmarks. Constrained decoding compiles the schema into a token-level finite-state machine (FSM) that masks invalid tokens at every step, so the output cannot violate the schema. The tradeoff is provider-specific schema limitations (no root anyOf, all fields required, enum caps) and slight first-request latency for schema compilation. For local models, use Outlines, llama.cpp grammars, or vLLM guided decoding. If a provider lacks native support, fall back to schema validation plus retry, not regex.
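The validation-plus-retry fallback can be sketched as below, using only the standard library. The validate function is a deliberately tiny stand-in that covers only flat object schemas (required keys, primitive types, enums); a real project would more likely use a full JSON Schema validator or Pydantic. call_model is a hypothetical callable that takes a prompt string and returns the model's raw text.

```python
import json
from typing import Callable

SCHEMA_TYPES = {"string": str, "integer": int, "number": (int, float), "boolean": bool}


def validate(data, schema: dict) -> list[str]:
    """Check a flat object against required keys, primitive types, and enums."""
    if not isinstance(data, dict):
        return ["top-level value must be an object"]
    errors = []
    props = schema.get("properties", {})
    for key in schema.get("required", []):
        if key not in data:
            errors.append(f"missing required key '{key}'")
    for key, value in data.items():
        spec = props.get(key)
        if spec is None:
            errors.append(f"unexpected key '{key}'")
            continue
        py_type = SCHEMA_TYPES.get(spec.get("type"))
        # bool is a subclass of int in Python, so treat it separately.
        if py_type and (
            isinstance(value, bool) != (py_type is bool)
            or not isinstance(value, py_type)
        ):
            errors.append(f"key '{key}' should be {spec['type']}")
        elif "enum" in spec and value not in spec["enum"]:
            errors.append(f"key '{key}' must be one of {spec['enum']}")
    return errors


def generate_json(
    call_model: Callable[[str], str],
    prompt: str,
    schema: dict,
    max_attempts: int = 3,
) -> dict:
    """Call the model, validate the reply, and retry with error feedback."""
    errors: list[str] = []
    feedback = ""
    for _ in range(max_attempts):
        raw = call_model(prompt + feedback)
        try:
            errors = validate(json.loads(raw), schema)
        except json.JSONDecodeError as exc:
            errors = [f"invalid JSON: {exc.msg}"]
        if not errors:
            return json.loads(raw)
        feedback = (
            "\n\nYour previous reply was rejected: "
            + "; ".join(errors)
            + ". Return only corrected JSON."
        )
    raise ValueError(f"no valid output after {max_attempts} attempts: {errors}")
```

Feeding the concrete validation errors back into the next prompt gives the model something specific to correct. A fenced or otherwise malformed reply is rejected and retried, not patched up with regex.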
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-13T02:40:18.947689+00:00— report_created — created