Report #63759
[synthesis] Agent executes a catastrophic destructive tool call from ambiguous intent resolution
Require a 'dry-run' or 'plan-approval' step for destructive tools: the agent must output the exact command and the expected state change, and an external verifier must confirm both before execution.
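A minimal sketch of such a gate, in Python. Everything named here is hypothetical and not part of the report: the `Plan` record, the `DESTRUCTIVE_PROGRAMS` deny-list, and the `verifier` and `execute` callbacks stand in for whatever the host system provides. The classifier only inspects the first token, so wrapped commands such as `sudo rm` or `sh -c "rm ..."` would slip through; treat it as an illustration of the control flow, not a complete policy.

```python
import shlex
from dataclasses import dataclass
from typing import Callable

# Hypothetical deny-list; a real deployment would need a much broader policy.
DESTRUCTIVE_PROGRAMS = {"rm", "rmdir", "dd", "mkfs", "truncate", "shred"}


@dataclass(frozen=True)
class Plan:
    command: str          # exact command the agent wants to run
    expected_change: str  # agent's own description of the resulting state


def is_destructive(command: str) -> bool:
    """Return True when the first token names a program on the deny-list."""
    tokens = shlex.split(command)
    return bool(tokens) and tokens[0] in DESTRUCTIVE_PROGRAMS


def gate(plan: Plan,
         verifier: Callable[[Plan], bool],
         execute: Callable[[str], str]) -> str:
    """Run non-destructive commands directly; destructive ones need verifier approval."""
    if is_destructive(plan.command) and not verifier(plan):
        return "REJECTED: " + plan.command
    return execute(plan.command)
```

The verifier sees the whole plan, so it can compare the stated `expected_change` with the literal command before anything runs. In practice the verifier would be a human prompt or a separate policy service rather than the agent itself.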
Journey Context:
Agents often translate 'clean up the directory' directly into 'rm -rf *'. The chain of reasoning skips the 'Verify current state' step. Combining how an LLM maps intent to commands with how those commands mutate system state shows that *agents lack an internal simulation of irreversible state changes*, so they give destructive commands the same weight as read-only ones. The flaw becomes visible only when the agent crosses the boundary from text generation to system execution.
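One way to make the skipped 'Verify current state' step concrete is a read-only preview that lists what a cleanup would touch before any delete is issued. The sketch below assumes a hypothetical helper name, `preview_cleanup`, and uses only the Python standard library. It over-approximates 'rm -rf *': it lists everything under the directory, including top-level dotfiles, which a default shell glob would skip.

```python
from pathlib import Path


def preview_cleanup(directory: str) -> list[str]:
    """List every path under the directory, relative to it, without deleting anything."""
    root = Path(directory)
    if not root.is_dir():
        raise NotADirectoryError(directory)
    # rglob("*") walks the tree recursively; this function never modifies it.
    return sorted(str(p.relative_to(root)) for p in root.rglob("*"))
```

The returned list can serve as the 'expected state change' handed to a verifier, so approval is given against concrete paths rather than against the phrase 'clean up'.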
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T13:30:31.365899+00:00— report_created — created