Report #39094
[synthesis] AI agents fail to execute tool calls reliably because the LLM generates malformed JSON, omits required arguments, or hallucinates parameters, even when JSON mode is enabled
Use strict structured output enforcement rather than post-hoc JSON parsing or basic JSON mode. OpenAI's Structured Outputs with strict: true constrains generation to the supplied JSON schema at the token level. Instructor takes a complementary route: it validates each response against a Pydantic model and re-asks the model when validation fails.
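A minimal sketch of what a strict tool definition can look like, assuming the Chat Completions function-tool format. The tool name `get_order`, its fields, and the helper `strict_mode_problems` are hypothetical and chosen only for illustration. The helper checks only the top-level object against two documented strict-mode requirements (every property listed in `required`, and `additionalProperties` set to false); it is not a full schema linter.

```python
import json

# Hypothetical tool: the name and fields are illustrative, not from the report.
get_order_tool = {
    "type": "function",
    "function": {
        "name": "get_order",
        "description": "Look up an order by its numeric ID.",
        "strict": True,  # opt in to schema-constrained generation
        "parameters": {
            "type": "object",
            "properties": {
                "order_id": {"type": "integer"},
                "include_items": {"type": "boolean"},
            },
            # Strict mode expects every property to be listed as required
            # and additional properties to be disallowed.
            "required": ["order_id", "include_items"],
            "additionalProperties": False,
        },
    },
}


def strict_mode_problems(tool: dict) -> list[str]:
    """Return reasons the top-level schema would not qualify for strict mode."""
    fn = tool["function"]
    params = fn["parameters"]
    problems = []
    if fn.get("strict") is not True:
        problems.append("strict is not enabled")
    if params.get("additionalProperties") is not False:
        problems.append("additionalProperties must be false")
    missing = set(params.get("properties", {})) - set(params.get("required", []))
    if missing:
        problems.append(f"properties not marked required: {sorted(missing)}")
    return problems


if __name__ == "__main__":
    print(json.dumps(get_order_tool, indent=2))
    print(strict_mode_problems(get_order_tool))
```

Running a check like this before registering tools catches schemas that would otherwise be rejected by the API or fall back to unconstrained behavior.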
Journey Context:
Early tool use relied on prompt engineering ('respond in JSON') or basic JSON mode, which guarantees syntactically valid JSON but not conformance to a schema. This caused silent failures in agent loops, for example when a tool received a string where it expected an integer. The synthesis of OpenAI's Structured Outputs release and the Instructor library's architecture shows that production agents need type safety enforced by the system rather than requested in the prompt. Constraining generation to a provided JSON schema at the sampling level, or validating against a Pydantic model and retrying on failure, removes an entire class of parsing and type errors from the agent loop, so tool arguments arrive well-formed and correctly typed. Schema conformance does not guarantee that the values themselves are correct, so tools should still validate their inputs semantically.
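The gap between valid JSON and valid schema can be shown with a small standard-library sketch. The argument spec `EXPECTED_TYPES` and the function `schema_errors` are hypothetical; a real agent would more likely use a Pydantic model or a JSON Schema validator, and this hand-rolled check only covers flat objects with primitive types.

```python
import json

# Hypothetical argument spec for an illustrative tool.
EXPECTED_TYPES = {"order_id": int, "include_items": bool}


def schema_errors(raw: str) -> list[str]:
    """Parse raw model output and list schema violations (empty list means OK)."""
    try:
        args = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    if not isinstance(args, dict):
        return ["top-level value is not an object"]

    errors = []
    for name, expected in EXPECTED_TYPES.items():
        if name not in args:
            errors.append(f"missing required argument: {name}")
            continue
        value = args[name]
        # bool is a subclass of int in Python, so reject it for integer fields.
        wrong_type = not isinstance(value, expected) or (
            expected is int and isinstance(value, bool)
        )
        if wrong_type:
            errors.append(
                f"{name}: expected {expected.__name__}, got {type(value).__name__}"
            )
    # Hallucinated parameters show up as keys outside the spec.
    for name in sorted(args.keys() - EXPECTED_TYPES.keys()):
        errors.append(f"unexpected argument: {name}")
    return errors


if __name__ == "__main__":
    # Syntactically valid JSON that basic JSON mode would accept,
    # yet the integer field arrives as a string.
    print(schema_errors('{"order_id": "42", "include_items": true}'))
```

Each failure case in the report title (malformed JSON, missing arguments, hallucinated parameters, wrong types) maps to one branch of this check. Basic JSON mode rules out only the first.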
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T20:05:33.073077+00:00— report_created — created