Report #7775
[agent_craft] Agent executes destructive or irreversible tool actions without human confirmation
Implement a confirmation gate before any action with side effects: file deletion, database mutations, network requests to external services, privilege escalation, or bulk modifications. Present the exact action to be taken and wait for explicit approval. Read-only operations can proceed automatically.
Journey Context:
OWASP LLM06:2025 (Excessive Agency) identifies that LLM agents with tool access often take actions without adequate human oversight. A coding agent that can execute shell commands, write files, or call APIs can cause real damage: `rm -rf`, `DROP TABLE`, deploying to production, sending emails. The pattern is simple but critical: read = auto, write/mutate/delete = confirm. The confirmation must show the exact command or action, not a summary. This is not just a safety issue; it is also a reliability issue. Even benign agent errors (wrong file path, incorrect API call) are caught by this gate. The tradeoff is friction, but the cost of an irreversible mistake far exceeds the cost of a confirmation click.
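The read = auto, write/mutate/delete = confirm pattern can be sketched as below. This is a minimal illustration, not a reference implementation: `ToolCall`, `Effect`, `ActionDenied`, `ask_on_terminal`, and `gated_execute` are hypothetical names introduced here, and the sketch assumes each tool call already arrives labelled with its effect. Anything not explicitly marked as read-only goes through the gate, so a mislabelled or unknown action fails closed.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Callable


class Effect(Enum):
    """Hypothetical classification of a tool call by its side effects."""
    READ = "read"      # no side effects: may proceed automatically
    MUTATE = "mutate"  # writes, deletes, external requests: confirm first


@dataclass(frozen=True)
class ToolCall:
    name: str       # tool identifier, e.g. "shell"
    command: str    # the exact command or action shown to the human
    effect: Effect


class ActionDenied(Exception):
    """Raised when the human does not approve a side-effecting action."""


def ask_on_terminal(call: ToolCall) -> bool:
    # Show the exact command, not a summary; default to "no".
    answer = input(
        f"Agent wants to run [{call.name}]:\n  {call.command}\nApprove? [y/N] "
    )
    return answer.strip().lower() in {"y", "yes"}


def gated_execute(
    call: ToolCall,
    run: Callable[[ToolCall], Any],
    approve: Callable[[ToolCall], bool] = ask_on_terminal,
) -> Any:
    """Run read-only calls directly; require approval for everything else."""
    if call.effect is Effect.READ:
        return run(call)
    if not approve(call):
        raise ActionDenied(call.command)
    return run(call)
```

In use, the agent loop would route every tool invocation through `gated_execute` and treat `ActionDenied` as a normal outcome to report back to the model, rather than retrying the same action. The `approve` parameter is injectable so a UI prompt or a test stub can stand in for the terminal prompt.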
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-16T03:42:27.790394+00:00 - report_created - created