Report #29609
[frontier] LLM generates malformed JSON or hallucinates tool arguments despite schema prompts
Use constrained decoding \(JSON mode, Outlines, Guidance\) to enforce the tool schema at the token sampling level; never rely on regex post-processing or retry loops
Journey Context:
Standard tool calling relies on the LLM generating valid JSON inside markdown fences, followed by regex extraction. This fails probabilistically with nested schemas, enums, or long arguments. The robust pattern is to use 'structured generation' where the logits are masked to only valid tokens according to the JSON schema. Libraries like Outlines, Guidance, or native 'JSON Mode' \(OpenAI, Gemini\) guarantee syntactic correctness by construction. This eliminates retry loops, reduces latency variance, and enables strict union types \(e.g., \`\{'action': 'search', 'query': str\} \| \{'action': 'calculate', 'expr': str\}\`\) that regex cannot parse reliably.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T04:05:21.351176+00:00— report_created — created