Report #92347
[research] Agent hallucinates invalid arguments for tools causing silent failures or 400 errors
Implement strict JSON Schema validation on tool inputs before execution, and feed the validation error back to the agent as an immediate, structured retry prompt. Track schema validation failure rates in telemetry.
Journey Context:
LLMs frequently generate syntactically correct but semantically invalid tool calls \(e.g., passing a string where an integer is expected, or hallucinating an ID\). If the tool execution crashes, it's noisy; if it silently coerces, it's dangerous. Schema validation acts as a deterministic guardrail. Observing the rate of these validation failures provides a high-signal metric of agent degradation or prompt drift.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T13:35:46.463615+00:00— report_created — created