Report #55726
[architecture] Post-hoc JSON parsing fails when LLM generates invalid syntax or hallucinates extra tokens outside schema
Use constrained decoding (Outlines, Guidance, or llama.cpp grammars) to enforce valid syntax at token generation time: compile the JSON Schema to a finite-state machine (FSM) that masks invalid tokens during sampling.
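The masking step can be illustrated with a toy sketch. This is not the API of Outlines, Guidance, or llama.cpp: the vocabulary, the hand-written FSM table, and the `fake_logits` stand-in for a model are all hypothetical, and the FSM is written by hand rather than compiled from a JSON Schema. It only shows the core idea, that disallowed tokens receive a score of negative infinity before the next token is chosen.

```python
import json
import math
import random

# Hypothetical toy vocabulary: each entry is one "token" the model can emit.
VOCAB = ['{', '}', '"status"', ':', '"ok"', '"error"', 'Sure!', ',', '<eos>']

# Hand-written FSM for the schema {"status": "ok" | "error"}.
# Maps state -> {allowed token: next state}. A real library would derive
# this table from a JSON Schema instead of hard-coding it.
FSM = {
    0: {'{': 1},
    1: {'"status"': 2},
    2: {':': 3},
    3: {'"ok"': 4, '"error"': 4},
    4: {'}': 5},
    5: {'<eos>': 6},
}
FINAL_STATE = 6


def fake_logits(rng):
    """Stand-in for a model forward pass: random scores over VOCAB."""
    return [rng.uniform(-1.0, 1.0) for _ in VOCAB]


def constrained_generate(seed=0):
    """Greedy decoding where tokens the FSM disallows are masked to -inf."""
    rng = random.Random(seed)
    state, pieces = 0, []
    while state != FINAL_STATE:
        logits = fake_logits(rng)
        allowed = FSM[state]
        # Mask every token that is not a valid transition from this state.
        masked = [
            score if token in allowed else -math.inf
            for token, score in zip(VOCAB, logits)
        ]
        best = max(range(len(VOCAB)), key=masked.__getitem__)
        token = VOCAB[best]
        state = allowed[token]
        if token != '<eos>':
            pieces.append(token)
    return ''.join(pieces)


if __name__ == "__main__":
    text = constrained_generate(seed=1)
    print(text, json.loads(text))
```

Whatever scores the stand-in model produces, tokens such as `Sure!` can never be selected, so the result always parses and always carries one of the two allowed `status` values.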
Journey Context:
Post-processing (regex fixes, JSONRepair) is fragile for nested structures and can only patch syntax; it cannot enforce schema constraints such as types or enums. Constrained decoding masks invalid tokens at each generation step using a finite-state automaton or context-free grammar, so any output that runs to completion is valid by construction (output cut off by a token limit can still be incomplete). Common error: assuming OpenAI's 'JSON mode' guarantees schema adherence. It guarantees only syntactically valid JSON, not that required fields exist or that types match. Tradeoff: constrained decoding increases time-to-first-token (FSM compilation) and memory per request, but eliminates validation retries and parsing errors.
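The gap between "valid JSON" and "matches the schema" can be shown with a small standard-library check. The schema here (a required string `status` limited to `ok`/`error`, plus a required integer `retries`) and the helper name `schema_errors` are hypothetical, chosen only for illustration; a production setup would more likely use a full JSON Schema validator.

```python
import json

# Hypothetical schema: required fields with their expected Python types.
REQUIRED = {"status": str, "retries": int}
STATUS_ENUM = {"ok", "error"}


def schema_errors(text):
    """Return a list of violations; an empty list means the text conforms."""
    try:
        obj = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON syntax: {exc.msg}"]
    if not isinstance(obj, dict):
        return ["top-level value is not an object"]

    errors = []
    for field, expected in REQUIRED.items():
        if field not in obj:
            errors.append(f"missing required field: {field}")
        elif type(obj[field]) is not expected:  # exact match, so bool is not int
            errors.append(f"wrong type for field: {field}")
    status = obj.get("status")
    if isinstance(status, str) and status not in STATUS_ENUM:
        errors.append(f"status not in enum: {status}")
    return errors


if __name__ == "__main__":
    samples = [
        '{"status": "ok", "retries": 0}',       # conforms
        '{"status": "done", "retries": "3"}',   # parses, but violates the schema
        'Sure! {"status": "ok"}',               # extra tokens break parsing
    ]
    for sample in samples:
        print(sample, "->", schema_errors(sample))
```

The second sample is the case a syntax-only guarantee does not catch: it parses cleanly, yet it has a wrong type and an out-of-enum value.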
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T00:01:40.764512+00:00— report_created — created