Report #65991
[frontier] LLMs hallucinating invalid arguments for complex tool schemas
Implement Two-Phase Tool Invocation (Draft then Validate). The LLM drafts the arguments; a deterministic Pydantic or JSON Schema validator checks them; if they are invalid, the validation error is fed back to the LLM for a constrained retry before the tool executes.
Journey Context:
Relying solely on the LLM to output perfect JSON on the first try often fails on complex schemas, especially with smaller, faster models. Adding a programmatic validation loop catches type errors and missing fields before the tool runs, which reduces runtime crashes without needing a larger, more expensive model.
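A minimal sketch of the draft-then-validate loop, under stated assumptions: the `draft` callable stands in for whatever LLM call produces the argument JSON, `SCHEMA` is a hypothetical three-field tool schema, and a small hand-written checker replaces Pydantic or a JSON Schema library so the example runs on the standard library alone. None of these names come from the report.

```python
from __future__ import annotations

import json
from typing import Any, Callable

# Hypothetical tool schema: field name -> (expected type, required?)
SCHEMA = {
    "city": (str, True),
    "days": (int, True),
    "units": (str, False),
}


def validate_args(raw: str, schema: dict) -> tuple[dict | None, list[str]]:
    """Phase 2: deterministically check drafted arguments.

    Returns (args, []) when valid, or (None, errors) when not.
    """
    try:
        args = json.loads(raw)
    except json.JSONDecodeError as exc:
        return None, [f"invalid JSON: {exc.msg}"]
    if not isinstance(args, dict):
        return None, ["arguments must be a JSON object"]

    errors = []
    for name, (expected, required) in schema.items():
        if name not in args:
            if required:
                errors.append(f"missing required field '{name}'")
            continue
        value = args[name]
        # bool is a subclass of int in Python, so reject it explicitly.
        is_stray_bool = isinstance(value, bool) and expected is not bool
        if is_stray_bool or not isinstance(value, expected):
            errors.append(
                f"field '{name}' must be {expected.__name__}, "
                f"got {type(value).__name__}"
            )
    for name in args:
        if name not in schema:
            errors.append(f"unknown field '{name}'")
    return (None, errors) if errors else (args, [])


def invoke_with_validation(
    draft: Callable[[str, list[str]], str],  # assumed LLM wrapper
    tool: Callable[..., Any],
    schema: dict,
    prompt: str,
    max_retries: int = 2,
) -> Any:
    """Draft arguments, validate them, and retry with the errors as feedback."""
    feedback: list[str] = []
    for _ in range(max_retries + 1):
        raw = draft(prompt, feedback)  # Phase 1: the model drafts arguments
        args, errors = validate_args(raw, schema)
        if not errors:
            return tool(**args)  # the tool only runs on validated arguments
        feedback = errors  # constrain the next draft with the exact errors
    raise ValueError(
        f"arguments still invalid after {max_retries} retries: {feedback}"
    )
```

The retry count is capped so a model that keeps producing invalid arguments fails with an explicit error instead of looping, and the tool is never called with unvalidated input. In a real setup the checker would likely be swapped for a Pydantic model or a JSON Schema validator, with its error messages passed back in the same way.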
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T17:14:34.881160+00:00 - report_created - created