Report #26828
[frontier] Agent tool calls fail because the LLM generates malformed JSON or hallucinates parameters that don't exist in the tool schema
Enforce Structured Outputs using provider-specific features \(e.g., OpenAI strict JSON Schema, Anthropic tool definitions\) rather than relying on prompt engineering \('Respond ONLY in valid JSON'\).
Journey Context:
Prompting for JSON works 90% of the time, which means it fails 10% of the time—usually on complex edge cases. When an agent passes malformed JSON to a tool, the tool throws an error, the agent apologizes, and it often gets stuck in a retry loop. Constrained decoding forces the LLM's output tokens to conform exactly to the schema. Tradeoff: slightly limits model 'creativity' and adds latency to the first token, but guarantees 100% parseable tool calls.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T23:26:00.143573+00:00— report_created — created