Report #42359
[frontier] LLMs output malformed JSON or hallucinate schema fields breaking downstream tool chains
Enforce JSON Schema at the decoding layer using constrained decoding: export Pydantic/Zod models to JSON Schema, compile the schema to a grammar where the runtime requires one (for example GBNF for llama.cpp), and use grammar-constrained sampling (via outlines, llama.cpp, or OpenAI Structured Outputs) so that outputs are schema-valid at token generation time, removing the need for post-hoc syntax repair and retries
Journey Context:
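Regex validation and retry loops waste tokens and add latency. Post-hoc fixing with 'please output valid JSON' is unreliable. Constrained decoding masks the logits at each generation step so that only tokens permitted by the grammar can be sampled, which rules out hallucinated keys and syntax errors. This trades implementation complexity for guaranteed structural validity of agent tool inputs, which is critical for deterministic orchestration. The guarantee covers syntax and schema shape only: field values can still be semantically wrong, and an output cut off by a token limit can still be incomplete.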
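The masking step described above can be illustrated with a toy sketch. Nothing here is the API of outlines, llama.cpp, or OpenAI: the vocabulary, the `ALLOWED` state table, `fake_logits`, and `constrained_generate` are all hypothetical names made up for this example, and the "model" is just a random number generator. The grammar is hard-coded for the schema {"name": string, "age": integer} rather than compiled from a real schema.

```python
import json
import math
import random

# Hypothetical toy vocabulary; a real tokenizer has tens of thousands of entries.
VOCAB = ['{', '}', '"name"', '"age"', '"email"', ':', ',',
         '"Ada"', '"Bob"', '42', '7', 'oops', '<eos>']

# Hand-written grammar for {"name": string, "age": integer}, expressed as a
# linear state machine: state index -> set of tokens allowed at that step.
ALLOWED = [
    {'{'}, {'"name"'}, {':'}, {'"Ada"', '"Bob"'}, {','},
    {'"age"'}, {':'}, {'42', '7'}, {'}'}, {'<eos>'},
]

def fake_logits(rng):
    """Stand-in for a model forward pass: random scores over VOCAB."""
    return [rng.uniform(-5.0, 5.0) for _ in VOCAB]

def constrained_generate(seed):
    """Greedy decoding where grammar-forbidden tokens are masked to -inf."""
    rng = random.Random(seed)
    pieces = []
    for allowed in ALLOWED:
        logits = fake_logits(rng)
        # The mask: any token the grammar forbids in this state cannot win.
        masked = [score if tok in allowed else -math.inf
                  for tok, score in zip(VOCAB, logits)]
        best = max(range(len(VOCAB)), key=masked.__getitem__)
        if VOCAB[best] == '<eos>':
            break
        pieces.append(VOCAB[best])
    return ''.join(pieces)

if __name__ == "__main__":
    for seed in range(3):
        text = constrained_generate(seed)
        print(text, "->", json.loads(text))
```

Even though the scores are random and the vocabulary contains junk such as `oops` and an unrequested `"email"` key, every output parses and carries exactly the two schema keys, because those tokens are never selectable. Real implementations track a much richer grammar state, but the per-step masking idea is the same.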
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T01:34:23.205044+00:00— report_created — created