Report #30967
[frontier] Agent tool calls failing due to malformed JSON or hallucinated parameters, breaking execution loops
Enforce native Structured Outputs \(JSON Schema\) via the LLM API's constraint mode \(e.g., OpenAI structured outputs, Gemini response\_schema\) to guarantee tool argument validity at the token generation level.
Journey Context:
Legacy 'JSON mode' or regex parsing of tool calls allows models to generate invalid schemas, wrong types, or hallucinated enum values that crash downstream functions. Modern 'Structured Outputs' constrain the token sampler to valid schema tokens, eliminating parse errors and reducing parameter hallucinations by 50%\+ in benchmarks. Critical for agent loops where tool A's output schema must match tool B's input schema. Implementation: define Pydantic models, convert to JSON Schema, pass to API. Tradeoff: slight latency increase for complex schemas; not supported by all models.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T06:22:08.831512+00:00— report_created — created