Report #36901
[synthesis] How to reliably parse LLM tool calls and function arguments without regex failures or malformed JSON breaking the agent loop?
Use constrained decoding (grammars), either through provider features like OpenAI Structured Outputs or through libraries like Outlines, to force the LLM to generate valid JSON conforming to a strict schema. Abandon regex-based parsing of free-text LLM outputs.
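A minimal sketch of what a strict schema and a boundary check might look like, using only the standard library. The `get_weather` schema, its field names, and the `parse_tool_args` helper are hypothetical and not from any provider SDK. The schema follows the general shape strict modes tend to expect (every property required, no additional properties), but check your provider's documentation for the exact rules. The check returns an error message instead of raising, so the loop can hand the message back to the model rather than crash.

```python
import json

# Hypothetical schema for an illustrative weather tool.
GET_WEATHER_SCHEMA = {
    "type": "object",
    "properties": {
        "city": {"type": "string"},
        "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
    },
    "required": ["city", "unit"],
    "additionalProperties": False,
}


def parse_tool_args(raw, schema):
    """Return (args, None) on success or (None, error_message) on failure.

    Covers only the subset of JSON Schema used above: required keys,
    no extra keys, string types, and enums.
    """
    try:
        args = json.loads(raw)
    except json.JSONDecodeError as exc:
        return None, f"invalid JSON: {exc.msg}"
    if not isinstance(args, dict):
        return None, "arguments must be a JSON object"

    missing = [key for key in schema["required"] if key not in args]
    if missing:
        return None, f"missing keys: {missing}"

    extra = [key for key in args if key not in schema["properties"]]
    if extra:
        return None, f"unexpected keys: {extra}"

    for key, spec in schema["properties"].items():
        value = args[key]
        if spec["type"] == "string" and not isinstance(value, str):
            return None, f"{key} must be a string"
        if "enum" in spec and value not in spec["enum"]:
            return None, f"{key} must be one of {spec['enum']}"

    return args, None
```

With constrained decoding in place this check should rarely fire, but it is cheap and it keeps a truncated generation such as `{"city": "Os` from raising inside the agent loop.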
Journey Context:
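Historically, developers prompted the LLM to output JSON and extracted it with regex or json.loads. This crashed often on missing commas or truncated output, while hallucinated keys passed parsing and only failed later, at tool dispatch. With the rapid adoption of OpenAI's Structured Outputs and libraries like Outlines, constrained decoding has become a common answer, on the view that reliable tool use depends on output that is always syntactically valid. Constrained decoding masks the logits at each generation step so that only tokens that keep the output valid under the JSON schema can be sampled. For a completed generation the orchestrator should then not see a parse error, which makes agent loops far more robust. Generations cut off by a token limit, and model refusals, can still yield incomplete output, so keep a cheap check at the boundary.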
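The logit-masking idea can be illustrated with a toy decoder. This is a sketch under heavy assumptions and not how Outlines or any provider implements it. The "grammar" is just a finite set of valid outputs, the vocabulary is a handful of made-up tokens, and `toy_logits` is a stand-in model that deliberately prefers malformed tokens. All names here are hypothetical.

```python
import json
import math

# The "grammar": a finite set of valid tool calls (hypothetical tool names).
ALLOWED_OUTPUTS = [json.dumps({"tool": name}) for name in ("search", "calculator")]

EOS = "<eos>"


def toy_logits(prefix):
    """Stand-in for a model. Ranks malformed tokens highest on purpose."""
    ranked = [
        "Sure! ",        # chatty preamble, not JSON
        "{'tool': ",     # single quotes, invalid JSON
        '"serach"',      # misspelled tool name
        '"search"',
        '{"tool": ',
        "}",
        '"calculator"',
        EOS,
    ]
    return {tok: float(len(ranked) - i) for i, tok in enumerate(ranked)}


def is_allowed(prefix, token):
    """A token is allowed if the result is still a prefix of a valid output."""
    if token == EOS:
        return prefix in ALLOWED_OUTPUTS
    candidate = prefix + token
    return any(out.startswith(candidate) for out in ALLOWED_OUTPUTS)


def constrained_decode(max_steps=10):
    out = ""
    for _ in range(max_steps):
        logits = toy_logits(out)
        # Mask: disallowed tokens get -inf so they can never be selected.
        masked = {
            tok: (score if is_allowed(out, tok) else -math.inf)
            for tok, score in logits.items()
        }
        best = max(masked, key=masked.get)
        if masked[best] == -math.inf:
            raise RuntimeError("no valid continuation")
        if best == EOS:
            return out
        out += best
    raise RuntimeError("hit step limit before completing")
```

Even though the stand-in model ranks `Sure! ` and the misspelled `"serach"` above every valid token, the mask leaves only grammar-conforming choices, so `json.loads(constrained_decode())` succeeds. The step limit plays the role of a token limit: if it is too small, decoding stops before the output is complete, which is the case a boundary check still has to cover.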
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T16:24:39.146066+00:00— report_created — created