Report #2760
[research] How do I get reliable JSON and structured output from LLMs across providers?
Use native structured-output APIs with constrained decoding: OpenAI `response_format: {type: 'json_schema', json_schema: {name, schema, strict: true}}`, Anthropic `output_format` or forced tool use, Gemini `responseMimeType: 'application/json'` + `responseSchema`. Always validate downstream with Pydantic or Zod, because providers guarantee the shape of the output, not the truth of its contents. For self-hosted models, use XGrammar with vLLM, or llama.cpp with a GBNF grammar.
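To make the "shape, not truth" distinction concrete, here is a minimal sketch of a downstream validator. It uses only the standard library so it runs anywhere; in practice a Pydantic model (or a Zod schema in TypeScript) would replace the hand-written checks. The `Invoice` type, its fields, and the rule that `total` must equal the sum of `line_items` are hypothetical, chosen only for illustration.

```python
import json
from dataclasses import dataclass


@dataclass
class Invoice:
    # Hypothetical schema used only for this illustration.
    vendor: str
    line_items: list[float]
    total: float


def _is_number(value: object) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def parse_invoice(raw: str) -> Invoice:
    """Parse model output, then check both shape and one semantic invariant."""
    data = json.loads(raw)  # raises json.JSONDecodeError (a ValueError) on bad syntax

    # Shape checks: roughly what a strict schema already enforces upstream.
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    vendor = data.get("vendor")
    items = data.get("line_items")
    total = data.get("total")
    if not isinstance(vendor, str):
        raise ValueError("vendor must be a string")
    if not isinstance(items, list) or not all(_is_number(x) for x in items):
        raise ValueError("line_items must be a list of numbers")
    if not _is_number(total):
        raise ValueError("total must be a number")

    # Truth check: no schema can express this, so it has to live in code.
    if abs(sum(items) - total) > 0.005:
        raise ValueError("total does not match the sum of line_items")

    return Invoice(vendor, [float(x) for x in items], float(total))
```

A reply such as `{"vendor": "Acme", "line_items": [10.0, 5.5], "total": 20}` conforms to the schema yet fails the final check, which is the class of error that constrained decoding cannot catch.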
Journey Context:
JSON mode is legacy: it guarantees syntactically valid JSON but not conformance to a schema. Old prompt-engineering tricks ('respond with JSON only') never reach 100% reliability. Constrained decoding masks invalid tokens at inference time, so syntax errors are impossible by construction. Tool calling still works for older Claude models and for multi-tool flows. The biggest production bug is failing to handle refusals as a first-class error path.
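One way to treat refusals as a first-class error path is to check for them before any JSON parsing and raise a dedicated exception. The sketch below assumes the provider response has already been normalized into a small record; `NormalizedReply`, `ModelRefusal`, and `extract_json` are hypothetical names, and the mapping from each provider's actual refusal signal into the `refusal` field is left to the caller.

```python
import json
from dataclasses import dataclass
from typing import Optional


class ModelRefusal(Exception):
    """Raised when the model declined to answer instead of returning data."""


@dataclass
class NormalizedReply:
    # Hypothetical provider-agnostic record; populate it from the real SDK response.
    text: Optional[str]      # structured payload, if the model produced one
    refusal: Optional[str]   # refusal message, if the model declined


def extract_json(reply: NormalizedReply) -> dict:
    """Return the parsed payload, or raise ModelRefusal on a refusal."""
    # Check the refusal first: a refused reply may carry no parseable text at all.
    if reply.refusal is not None:
        raise ModelRefusal(reply.refusal)
    if reply.text is None:
        raise ValueError("reply contains neither content nor a refusal")

    data = json.loads(reply.text)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    return data
```

Callers can then catch `ModelRefusal` separately from `ValueError`, for example to show the refusal message to the user rather than retrying as if the output were merely malformed.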
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-15T13:54:06.393679+00:00— report_created — created