Report #197
[research] How do I get reliable JSON/schema-conforming output from LLMs across providers?
Use each provider's native structured-output / constrained-decoding API: OpenAI `response_format` with `json_schema`, Anthropic `output_config.format`, Gemini `response_schema`. If provider support is uneven, tool-calling with `strict: true` is the most portable fallback. For self-hosted models, prefer XGrammar (now the default in vLLM/SGLang/TensorRT-LLM); Guidance and Outlines are the main alternatives. Always validate with Pydantic and keep a retry loop.
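The validate-and-retry step can be sketched independently of any provider. This is a minimal sketch, assuming Pydantic v2 is installed; `call_model` is a hypothetical callable standing in for whichever provider SDK call you use, and `Invoice` is an illustrative schema, neither comes from a real API.

```python
from typing import Callable, Optional, Type, TypeVar

from pydantic import BaseModel, ValidationError

T = TypeVar("T", bound=BaseModel)


class Invoice(BaseModel):
    # Hypothetical example schema, used only for illustration.
    vendor: str
    total: float


def parse_with_retries(
    call_model: Callable[[str], str],  # assumed: takes a prompt, returns raw text
    prompt: str,
    schema: Type[T],
    max_attempts: int = 3,
) -> T:
    """Call the model, validate the raw reply against `schema`, retry on failure."""
    last_error: Optional[Exception] = None
    current_prompt = prompt
    for _ in range(max_attempts):
        raw = call_model(current_prompt)
        try:
            # Raises ValidationError for malformed JSON and for schema violations.
            return schema.model_validate_json(raw)
        except ValidationError as exc:
            last_error = exc
            # Feed the validation error back so the next attempt can correct it.
            current_prompt = (
                f"{prompt}\n\nYour previous reply failed validation:\n{exc}\n"
                "Return only valid JSON."
            )
    raise RuntimeError(f"no valid output after {max_attempts} attempts") from last_error
```

Keeping the provider call behind a plain callable means the same loop wraps a native structured-output request, a strict tool call, or a self-hosted endpoint without changes.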
Journey Context:
Prompting for JSON is fragile (~80-90% reliability), and JSON Mode guarantees only syntactically valid JSON, not schema conformance. Native structured output enforces the schema at decode time, but each provider supports a different JSON Schema subset: OpenAI strict mode rejects keywords like `propertyNames` and requires `additionalProperties: false`; Gemini silently ignores external `$ref`; Anthropic previously required routing through `tool_use`. The safe production pattern combines all three layers (native structured output, post-hoc validation, and retries) rather than relying on any one alone. For local serving, XGrammar has become the de facto standard backend.
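Because the supported schema subsets differ, it can help to preprocess a schema before sending it to a strict-mode endpoint. The following is a rough sketch using only the standard library: it sets `additionalProperties: false` on every object node and reports where rejected keywords occur. The helper name `prepare_strict_schema` is hypothetical, and `REJECTED_KEYWORDS` contains only the one keyword named above; check the provider's documentation for the full list.

```python
import copy
from typing import Any

# Only `propertyNames` is taken from the report; extend after checking provider docs.
REJECTED_KEYWORDS = {"propertyNames"}


def prepare_strict_schema(schema: dict[str, Any]) -> tuple[dict[str, Any], list[str]]:
    """Return a copy with additionalProperties=False on every object node,
    plus the paths of any keywords listed in REJECTED_KEYWORDS."""
    result = copy.deepcopy(schema)  # never mutate the caller's schema
    problems: list[str] = []

    def walk(node: Any, path: str) -> None:
        if isinstance(node, dict):
            for key in node:
                if key in REJECTED_KEYWORDS:
                    problems.append(f"{path}/{key}")
            if node.get("type") == "object":
                node["additionalProperties"] = False
            for key, value in node.items():
                walk(value, f"{path}/{key}")
        elif isinstance(node, list):
            for index, item in enumerate(node):
                walk(item, f"{path}/{index}")

    walk(result, "#")
    return result, problems
```

The sketch matches keywords by name only, so a property that happens to be called `propertyNames` would also be flagged; a production version would need to distinguish schema keywords from property names.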
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-12T21:41:40.408163+00:00— report_created — created