Report #74106
[frontier] How to prevent agents from generating invalid JSON or hallucinating tool parameters?
Use constrained decoding via the Outlines library to enforce JSON Schema or regex patterns at the token generation level, guaranteeing syntactically valid outputs that match tool signatures.
Journey Context:
Even with prompt engineering and few-shot examples, agents frequently hallucinate extra fields, use the wrong data types (strings instead of integers), or produce malformed JSON when calling tools. Post-hoc validation with retries wastes tokens and increases latency. Constrained generation (also called structured decoding or guided generation) restricts the tokens the LLM may emit at each step to those that keep the output conformant with the target schema. This moves validation from post-hoc checking to generation-time enforcement and eliminates an entire class of structural agent errors; it does not guarantee that the values themselves are correct. Libraries such as Outlines (regex/FSM-based) and llama.cpp's grammar-based sampling provide this for open models, while hosted APIs such as OpenAI's and Anthropic's offer JSON or schema-constrained output modes.
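The masking step described above can be illustrated with a toy, standard-library-only sketch. It is not the Outlines or llama.cpp API: `toy_logits` is a hypothetical stand-in for a model that prefers to emit a string where an integer is required, and `allowed_next` is a hand-written state machine for the single shape `{"count": <integer>}`. Real libraries derive the equivalent machine from a JSON Schema or regex and apply it over the tokenizer's vocabulary rather than over characters.

```python
import json
import string

EOS = "<eos>"
# Character-level toy vocabulary (an assumption; real models use subword tokens).
VOCAB = list('{}": ') + list(string.ascii_lowercase) + list(string.digits) + [EOS]
PREFIX = '{"count": '
WANTED = '{"count": "three"}'  # what the toy model "wants": a string, not an integer


def allowed_next(text):
    """Tokens that keep `text` a valid prefix of {"count": <integer>}."""
    if len(text) < len(PREFIX):
        return {PREFIX[len(text)]}          # still inside the fixed key
    body = text[len(PREFIX):]
    if body.endswith("}"):
        return {EOS}                        # object is complete
    if body == "":
        return set(string.digits)           # at least one digit is required
    if body == "0":
        return {"}"}                        # JSON forbids leading zeros
    return set(string.digits) | {"}"}


def toy_logits(text):
    """Hypothetical model: strongly prefers WANTED, mildly prefers '3' otherwise."""
    pos = len(text)
    target = WANTED[pos] if pos < len(WANTED) else EOS
    return {tok: 5.0 if tok == target else 1.0 if tok == "3" else 0.0 for tok in VOCAB}


def decode(constrained, max_steps=64):
    """Greedy decoding; when constrained, disallowed tokens are never candidates."""
    text = ""
    for _ in range(max_steps):
        logits = toy_logits(text)
        candidates = allowed_next(text) if constrained else set(VOCAB)
        token = max(candidates, key=logits.__getitem__)
        if token == EOS:
            break
        text += token
    return text


free = decode(constrained=False)
safe = decode(constrained=True)
print(free, type(json.loads(free)["count"]).__name__)
print(safe, type(json.loads(safe)["count"]).__name__)
```

In this sketch the unconstrained run reproduces the model's preferred string value, while the constrained run can only ever yield an integer under the single key `count`. The integer it yields is still whatever the model scores highest among the allowed tokens, so the constraint fixes the shape of the output, not its meaning. The `max_steps` cap also hints at a practical caveat: if generation is cut off before the closing brace, the output is a valid prefix but not yet valid JSON.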
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T06:58:59.987515+00:00— report_created — created