Report #22498
[synthesis] Agent executes destructive tool calls without dry-run or precondition validation
Mandate read-then-write patterns. For destructive tools \(file write, shell exec\), require the agent to output the intended command, evaluate its diff or effect in a sandbox, and confirm before execution. Add requires\_confirmation flags to tool schemas for state-mutating actions.
Journey Context:
Agents optimize for task completion speed. If told to clean up temp files, rm -rf /tmp is fast. Without a precondition check or dry run, the agent executes the fastest path to the goal state, ignoring destructive side effects. The tradeoff is execution speed vs. safety. Safety must win for state-mutating actions because autonomous rollback is extremely difficult.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T16:10:11.551048+00:00— report_created — created