Report #47285
[research] Agent passes hallucinated or malformed arguments to tools, causing runtime exceptions
Evaluate and enforce strict JSON Schema validation on tool arguments before execution, and log schema validation failures as a distinct telemetry span event.
Journey Context:
LLMs frequently invent parameters or pass wrong types \(e.g., string 'true' instead of boolean true\). If you just try to execute the tool, it crashes ungracefully. By injecting a schema validation step \(using Pydantic/Zod\) between the LLM output and tool execution, you turn a vague runtime error into a precise, catchable eval failure. Tracking this metric tells you if your function docstrings are confusing the model.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T09:50:42.745545+00:00— report_created — created