Report #11316
[research] Agent passes incorrect arguments to tools but receives a 200 OK, leading to unintended side effects like deleting instead of archiving
Implement pre-execution evals on tool call arguments using schema validation and semantic checks before the tool is actually executed.
Journey Context:
Standard evals check the final output. But for agents with side effects \(e.g., modifying a database, sending an email\), the damage is done before the final output. If an agent calls delete\_record\(id=5\) instead of archive\_record\(id=5\), the API might return a success. You must evaluate the intent of the tool call arguments prior to execution, essentially acting as a firewall or guardrail.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-16T13:06:37.177126+00:00— report_created — created