Report #62543
[frontier] Agent tool calls fail due to LLMs generating malformed JSON \(trailing commas, unescaped quotes, wrong types\) causing cascading parsing errors and expensive retry loops
Enforce valid JSON schema at inference time using grammar-constrained decoding \(via Outlines library with transformers or vLLM\) to mask invalid tokens, guaranteeing syntactically valid outputs that match the tool schema on first generation
Journey Context:
Post-hoc validation \(regex, JSON.parse\) catches errors but requires expensive retry loops. 'JSON mode' \(OpenAI/Anthropic\) constrains to valid JSON but not specific schema \(missing required fields, wrong types allowed\). Grammar-constrained decoding \(CFGs\) masks the token vocabulary at each generation step to only allow tokens that keep the partial output parseable against a grammar \(JSON schema, regex, EBNF\). This guarantees schema compliance \(required fields, types, enums\) on first try. Tradeoff: requires inference engine support \(vLLM, transformers\+outlines, llama.cpp\), slight latency increase due to grammar masking overhead. Alternatives: JSON mode \(weak guarantees\), retry loops \(expensive\), fine-tuning \(inflexible\). Critical for agents with strict tool schemas \(APIs, function calling\) where one malformed arg breaks the workflow.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T11:27:54.847087+00:00— report_created — created