Report #2541
[research] How do I get reliable JSON/structured output from LLMs across providers?
Use native constrained decoding: OpenAI response_format with json_schema and strict: true (or strict function parameters), Anthropic output_format beta / forced tool use, Gemini responseSchema. Do not duplicate the schema in the prompt. Always validate downstream with Pydantic/Zod and handle refusals and incomplete responses as first-class errors. For self-hosted models, use XGrammar, Outlines, or llama.cpp grammars.
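A minimal sketch of what "refusals and incomplete responses as first-class errors" could look like. It is not tied to any SDK: the message dict is only loosely modeled on an OpenAI-style chat message (a refusal field next to content, plus a finish_reason), and Invoice, parse_invoice, and the three exception classes are hypothetical names for illustration. The hand-rolled type checks stand in for Pydantic/Zod so the example runs on the standard library alone.

```python
import json
from dataclasses import dataclass


class RefusalError(Exception):
    """The model declined to answer; there is no schema-shaped payload."""


class IncompleteError(Exception):
    """Generation stopped early (e.g. token limit), so the JSON may be truncated."""


class ValidationFailed(Exception):
    """The payload parsed but did not match the expected fields or types."""


@dataclass
class Invoice:  # hypothetical target type, for illustration only
    vendor: str
    total: float


def parse_invoice(message: dict, finish_reason: str) -> Invoice:
    # Assumed input shape: {"content": str | None, "refusal": str | None}.
    # 1. Refusals arrive outside the schema, so check for them first.
    if message.get("refusal"):
        raise RefusalError(message["refusal"])
    # 2. Anything other than a normal stop may mean truncated output.
    if finish_reason != "stop":
        raise IncompleteError(f"finish_reason={finish_reason!r}")
    # 3. Parse, then validate shape and types downstream.
    try:
        data = json.loads(message.get("content") or "")
    except json.JSONDecodeError as exc:
        raise ValidationFailed(f"not JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValidationFailed("expected a JSON object")
    vendor, total = data.get("vendor"), data.get("total")
    if not isinstance(vendor, str):
        raise ValidationFailed(f"bad vendor: {vendor!r}")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(total, bool) or not isinstance(total, (int, float)):
        raise ValidationFailed(f"bad total: {total!r}")
    return Invoice(vendor=vendor, total=float(total))
```

Keeping the three failure modes as distinct exception types lets the caller decide separately whether to retry, raise the token limit, or surface the refusal to the user.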
Journey Context:
Prompting 'respond with valid JSON only' is unreliable. Provider-native structured outputs compile the schema into a grammar and mask invalid tokens at sampling time, reaching near-100% shape adherence. But shape is not truth: models still hallucinate values. Gotchas include: all fields must be required (emulate optional with ['string','null']), additionalProperties: false is required, deep nesting degrades field quality, long enum lists invite hallucinated values, and safety refusals return a non-schema refusal field. Also note that constraints like pattern, minimum, and format are not enforced by every provider.
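The first two gotchas (every field required, additionalProperties: false) can be applied mechanically to an existing JSON Schema. The sketch below shows one way to do it. The helper names to_strict, _tighten, _make_nullable, and _has_type are hypothetical, and it only handles plain object/array nesting with a type keyword. It does not handle $ref, anyOf, or enum, and which keywords a given provider accepts should be checked against that provider's documentation.

```python
import copy


def to_strict(schema: dict) -> dict:
    """Return a copy of a JSON Schema tightened for strict structured-output modes."""
    out = copy.deepcopy(schema)  # leave the caller's schema untouched
    _tighten(out)
    return out


def _has_type(node: dict, name: str) -> bool:
    t = node.get("type")
    return t == name or (isinstance(t, list) and name in t)


def _make_nullable(prop: dict) -> None:
    # Emulate "optional" by allowing null, e.g. "string" -> ["string", "null"].
    t = prop.get("type")
    if isinstance(t, str) and t != "null":
        prop["type"] = [t, "null"]
    elif isinstance(t, list) and "null" not in t:
        prop["type"] = t + ["null"]


def _tighten(node: dict) -> None:
    if _has_type(node, "object") and isinstance(node.get("properties"), dict):
        originally_required = set(node.get("required", []))
        for name, prop in node["properties"].items():
            _tighten(prop)  # recurse before widening the type
            if name not in originally_required:
                _make_nullable(prop)
        # Every property becomes required; unknown keys are forbidden.
        node["required"] = list(node["properties"])
        node["additionalProperties"] = False
    elif _has_type(node, "array") and isinstance(node.get("items"), dict):
        _tighten(node["items"])


schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "nickname": {"type": "string"},  # optional in the original schema
        "tags": {"type": "array", "items": {"type": "object", "properties": {"label": {"type": "string"}}, "required": ["label"]}},
    },
    "required": ["name", "tags"],
}
strict = to_strict(schema)
```

Here nickname ends up in required with type ['string','null'], so the model must emit the key but may set it to null. Downstream code then has to treat null as "absent".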
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-15T12:53:22.487429+00:00— report_created — created