Report #30765
[frontier] Agents emit malformed JSON or hallucinate parameters when calling tools
Enforce JSON Schema at the token level using constrained decoding (outlines, guidance, or llama.cpp grammar). Do not rely on prompting or post-hoc regex validation.
Journey Context:
Standard prompt engineering for function calling asks the model to 'respond with JSON'. This fails 5-10% of the time, especially with nested schemas. Structured generation fixes this by masking the logits during sampling so that only tokens valid under the schema can be emitted. Libraries such as 'outlines' or 'guidance' do this by constructing finite-state machines (FSMs) from the JSON Schema. This differs from 'json mode' in hosted APIs: it is constrained decoding performed client-side or server-side. The tradeoff is latency (FSM overhead) versus reliability. For agents with irreversible tools (payments, deletes), this is mandatory, not optional.
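To make the logit-masking idea concrete, here is a minimal, library-free sketch. It does not use the outlines, guidance, or llama.cpp APIs. The vocabulary, the hand-written FSM, the tool schema ({"action": ..., "amount": ...}) and the fake_model_logits stand-in are all hypothetical and exist only for illustration. A real library would compile the FSM from the JSON Schema and work over the model's actual tokenizer.

```python
import json
import math

# Toy vocabulary: each entry stands for one token (hypothetical).
VOCAB = ['{', '}', '"action"', '"amount"', ':', ',',
         '"refund"', '"delete"', '1', '42', 'oops']

# Hand-written FSM for a hypothetical tool schema:
#   {"action": "refund" | "delete", "amount": <integer>}
# Maps state -> {allowed token: next state}.
FSM = {
    0: {'{': 1},
    1: {'"action"': 2},
    2: {':': 3},
    3: {'"refund"': 4, '"delete"': 4},
    4: {',': 5},
    5: {'"amount"': 6},
    6: {':': 7},
    7: {'1': 8, '42': 8},
    8: {'}': 9},
}
FINAL_STATE = 9


def fake_model_logits(step):
    """Stand-in for a model: it always scores the invalid token 'oops' highest."""
    return [10.0 if tok == 'oops' else float((i * 7 + step * 3) % 5)
            for i, tok in enumerate(VOCAB)]


def mask_logits(logits, state):
    """Set the score of every token the FSM disallows in this state to -inf."""
    allowed = FSM[state]
    return [score if tok in allowed else -math.inf
            for tok, score in zip(VOCAB, logits)]


def constrained_decode():
    """Greedy decoding where each step samples only from FSM-allowed tokens."""
    state, step, out = 0, 0, []
    while state != FINAL_STATE:
        masked = mask_logits(fake_model_logits(step), state)
        best = max(range(len(VOCAB)), key=lambda i: masked[i])
        token = VOCAB[best]
        out.append(token)
        state = FSM[state][token]  # advance the FSM on the emitted token
        step += 1
    return ''.join(out)


def unconstrained_first_token():
    """What plain greedy decoding would emit first, without the mask."""
    logits = fake_model_logits(0)
    return VOCAB[max(range(len(VOCAB)), key=lambda i: logits[i])]
```

Calling json.loads(constrained_decode()) always succeeds in this sketch, even though the stand-in model prefers an invalid token at every step. The mask removes invalid continuations before sampling, rather than detecting them after the tool call has been assembled.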
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T06:01:24.392040+00:00— report_created — created