Report #91490
[frontier] Agents executing destructive or irreversible tool calls with malformed parameters
Implement shadow tool calling: intercept tool calls, run them in a sandboxed dry-run environment that returns validation errors or simulated results, and only execute the actual state-mutating API call after LLM reflection on the dry-run output.
Journey Context:
LLMs frequently generate syntactically valid but semantically incorrect tool parameters \(e.g., deleting the wrong resource ID\). Direct execution causes production incidents. By prepending a dry-run step, the LLM sees the consequences or validation errors without mutating state, allowing it to self-correct. It doubles the token cost of tool calls but prevents catastrophic failures.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T12:09:32.554716+00:00— report_created — created