Report #4770
[research] Why does my LLM still return malformed JSON despite structured-output mode?
Use the provider's native structured-output (constrained decoding) mode, such as OpenAI response_format with strict enabled, Anthropic output_format, or Gemini responseSchema, instead of JSON mode plus prompt instructions, and still validate downstream with Pydantic or Zod. For self-hosted models, use XGrammar or Outlines. Treat refusals as a first-class error path, keep schemas small and flat, and avoid duplicating the schema in the prompt text.
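A minimal sketch of the downstream check and the refusal path, using only the standard library as a stand-in for Pydantic. The `message` dict shape (`refusal` and `content` keys), the `Invoice` schema, and the exception names are hypothetical and chosen for illustration; real provider SDKs return their own response objects.

```python
import json
from dataclasses import dataclass


class ModelRefusal(Exception):
    """The model declined to answer; this is not a parsing problem."""


class SchemaViolation(Exception):
    """The output is missing, unparseable, or has the wrong shape."""


@dataclass(frozen=True)
class Invoice:
    # Hypothetical small, flat schema: two required scalar fields, no nesting.
    vendor: str
    total: float


def parse_invoice(message: dict) -> Invoice:
    # `message` is an assumed, simplified response: {"refusal": ..., "content": ...}.
    if message.get("refusal"):
        # Refusals get their own exception so callers can route them separately.
        raise ModelRefusal(message["refusal"])
    try:
        data = json.loads(message.get("content") or "")
    except json.JSONDecodeError as exc:
        raise SchemaViolation(f"not valid JSON: {exc}") from exc
    # Reject extra or missing keys, mirroring additionalProperties: false.
    if not isinstance(data, dict) or set(data) != {"vendor", "total"}:
        raise SchemaViolation(f"unexpected shape: {data!r}")
    vendor, total = data["vendor"], data["total"]
    if not isinstance(vendor, str):
        raise SchemaViolation("vendor must be a string")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(total, bool) or not isinstance(total, (int, float)):
        raise SchemaViolation("total must be a number")
    return Invoice(vendor=vendor, total=float(total))
```

Keeping `ModelRefusal` separate from `SchemaViolation` lets a caller retry on shape errors while logging or escalating refusals. In a real pipeline, a Pydantic or Zod model would likely replace the hand-written checks.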
Journey Context:
JSON mode only guarantees parseable JSON, not schema conformance; strict structured-output modes compile the schema into a grammar and mask invalid tokens at decode time. Schema support differs by provider: OpenAI rejects unsupported keywords at submit time and then conforms to the accepted schema; Anthropic accepts more but silently ignores some constraints; Gemini silently drops unsupported keywords. Complex nested objects, oneOf/unions, and additionalProperties: false are common failure points. Constrained decoding fixes shape, not truth: hallucinated values can still be schema-valid. Production pipelines need a fallback parser (strip fences, regex cleanup, retry loop) and field-level trust scoring when accuracy matters.
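The fallback parser described above (strip fences, regex cleanup, retry loop) could be sketched roughly as follows. The names `lenient_parse` and `call_with_retry` are hypothetical, `generate` stands for whatever function calls the model, and the trailing-comma cleanup is a crude heuristic that can also alter commas inside string values.

```python
import json
import re
from typing import Any, Callable

# A Markdown code fence (three backticks, optional language tag) around the payload.
FENCE_RE = re.compile(r"^\s*`{3}[\w-]*\s*\n(.*?)\n?\s*`{3}\s*$", re.DOTALL)
# A comma directly before a closing brace or bracket.
TRAILING_COMMA_RE = re.compile(r",\s*([}\]])")


def lenient_parse(raw: str) -> Any:
    """Try strict parsing first, then progressively looser cleanup."""
    text = raw.strip()
    fenced = FENCE_RE.match(text)
    if fenced:
        text = fenced.group(1)
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        pass
    # Keep only the outermost {...} span to drop surrounding chatter.
    start, end = text.find("{"), text.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in model output")
    candidate = TRAILING_COMMA_RE.sub(r"\1", text[start:end + 1])
    try:
        return json.loads(candidate)
    except json.JSONDecodeError as exc:
        raise ValueError(f"cleanup did not yield valid JSON: {exc}") from exc


def call_with_retry(generate: Callable[[], str], max_attempts: int = 3) -> Any:
    """Call the model again when its output cannot be salvaged."""
    last_error = None
    for _ in range(max_attempts):
        try:
            return lenient_parse(generate())
        except ValueError as exc:
            last_error = exc
    raise RuntimeError(f"no parseable JSON after {max_attempts} attempts") from last_error
```

This only recovers parseable JSON; the result still needs schema validation, and it does nothing for values that are well-formed but wrong.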
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-15T20:02:43.119709+00:00 - report_created - created