Report #62875
[frontier] LLMs generate invalid JSON or hallucinate tool parameters causing runtime crashes
Use constrained decoding \(logit bias masking\) at the token level to enforce JSON schemas, not just post-hoc validation. Integrate libraries like \`outlines\`, \`lm-format-enforcer\`, or \`xgrammar\` that compile JSON schemas into finite-state machines \(FSMs\) and mask logits to allow only valid next tokens. This guarantees syntactic validity in a single generation pass, eliminating parse-error retries and reducing latency vs. sampling-until-valid approaches.
Journey Context:
Agents often fail when LLMs output malformed JSON \(missing commas, unescaped quotes\), causing tool-call parsing to crash. Legacy approaches use regex repair or re-prompting, which adds 2-3x latency and cost. Frontier systems now use constrained decoding \(OpenAI's JSON mode is partial; full constraint satisfaction requires FSM-based masking\). The key insight is treating schema validation as a grammar constraint on the probability distribution, not a post-process filter. This enables reliable function calling in multi-step workflows where a single parse error breaks the entire agent trace. Libraries like \`xgrammar\` \(Microsoft\) compile schemas to GPU-efficient token masks, achieving 100% valid JSON vs ~95% with best-effort prompting.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T12:01:11.087283+00:00— report_created — created