Report #87360
[research] Which LLM provider has the most reliable structured JSON output, and what are the traps?
Prefer OpenAI's Structured Outputs (response_format of type json_schema with strict: true) for the strongest schema guarantee via constrained decoding; it supports a broad subset of JSON Schema. Newer Claude models support a native output_format of type json_schema; older Anthropic integrations used the tool-use-as-schema pattern. Gemini supports response_json_schema with response_mime_type set to application/json, but accepts only a documented subset of JSON Schema. For self-hosted models, use vLLM structured_outputs, llama.cpp GBNF grammars, or Outlines. Keep schemas shallow, replace optional fields with required nullable ones, set additionalProperties: false, and validate semantically after parsing.
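The schema advice above (every property required, optional fields turned into nullable ones, additionalProperties: false) can be applied mechanically before a schema is sent to a provider. The following is a minimal sketch using only the standard library; the function names to_strict_schema, _tighten, _make_nullable, and _types are hypothetical helpers, not part of any provider SDK, and the sketch only handles plain "type" keywords (it leaves anyOf, $ref, and similar constructs untouched).

```python
import copy


def _types(node: dict) -> list:
    """Return the node's "type" keyword as a list (empty if absent)."""
    t = node.get("type")
    if isinstance(t, str):
        return [t]
    if isinstance(t, list):
        return list(t)
    return []


def _make_nullable(node) -> None:
    """Add "null" to a typed node; nodes without "type" are left as-is."""
    if not isinstance(node, dict):
        return
    types = _types(node)
    if types and "null" not in types:
        node["type"] = types + ["null"]


def _tighten(node) -> None:
    """Recursively rewrite object schemas in place."""
    if not isinstance(node, dict):
        return
    types = _types(node)
    props = node.get("properties")
    if "object" in types and isinstance(props, dict):
        originally_required = set(node.get("required", []))
        for name, sub in props.items():
            # A field that was optional becomes required-but-nullable.
            if name not in originally_required:
                _make_nullable(sub)
            _tighten(sub)
        node["required"] = list(props.keys())
        node["additionalProperties"] = False
    if "array" in types:
        _tighten(node.get("items"))


def to_strict_schema(schema: dict) -> dict:
    """Return a tightened deep copy; the input schema is not modified."""
    out = copy.deepcopy(schema)
    _tighten(out)
    return out
```

A transformation like this does not guarantee that a given provider accepts the result, since each one supports a different keyword subset; it only removes the most common causes of rejection, and the exact schema should still be tested against the target API.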
Journey Context:
JSON mode only promises syntactically valid JSON, not that keys exist or types match. Constrained decoding compiles the schema into a grammar and masks invalid tokens at each decoding step, which guarantees that a completed response matches the schema. The providers diverge: OpenAI's strict mode requires every property to be listed in required and additionalProperties to be false; Anthropic's older tool-use pattern wraps the output as a tool call argument; Gemini accepts only a documented subset of schema keywords. A common failure is sending a complex schema and getting a 400 or a silent fallback to unconstrained output, so test your exact schema against the provider you plan to use. Structured outputs guarantee shape, not truth: business-rule validation still belongs in your code.
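To illustrate the difference between shape and truth, here is a small sketch of post-parse validation using only the standard library. The invoice payload, its field names (items, price, qty, total), and the check_invoice function are assumptions invented for this example; real code would apply whatever business rules fit the actual schema.

```python
import json


def _is_number(value) -> bool:
    """True for int or float, but not bool (bool is a subclass of int)."""
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def check_invoice(raw: str) -> list[str]:
    """Return a list of problems; an empty list means the payload passed."""
    # Step 1: syntax. This is all that plain JSON mode promises.
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]

    # Step 2: shape. Constrained decoding would already guarantee this.
    if not isinstance(data, dict):
        return ["top level is not an object"]
    problems = []
    items = data.get("items")
    total = data.get("total")
    if not isinstance(items, list):
        problems.append("items must be a list")
    if not _is_number(total):
        problems.append("total must be a number")
    if problems:
        return problems

    # Step 3: business rules. No schema feature checks these.
    running = 0.0
    for i, item in enumerate(items):
        if not isinstance(item, dict):
            problems.append(f"items[{i}] has the wrong shape")
            continue
        price, qty = item.get("price"), item.get("qty")
        if not _is_number(price) or not isinstance(qty, int) or isinstance(qty, bool):
            problems.append(f"items[{i}] has the wrong shape")
            continue
        if qty <= 0:
            problems.append(f"items[{i}]: qty must be positive")
        if price < 0:
            problems.append(f"items[{i}]: price must not be negative")
        running += price * qty
    if not problems and abs(running - total) > 0.005:
        problems.append("total does not match the sum of line items")
    return problems
```

A payload such as {"items": [{"price": 2.5, "qty": 2}], "total": 9.0} is valid JSON and matches a reasonable schema, yet it fails the last rule because the line items sum to 5.0. Returning a list of problems rather than raising on the first one makes it easier to feed all failures back to the model in a single retry prompt.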
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T05:13:29.280306+00:00— report_created — created