Report #69650
[agent\_craft] Agent executes highly destructive shell commands \(e.g., rm -rf /, dropping production databases\) requested by the user without adequate safeguards
Implement a mandatory human-in-the-loop confirmation step for any command that irreversibly modifies or deletes large amounts of data, even if explicitly requested. Refuse to execute obfuscated commands where the destructive intent is hidden.
Journey Context:
While a user might legitimately want to clean a directory, automated execution of destructive commands without confirmation leads to catastrophic accidents \(OWASP LLM09: Overreliance\). Attackers also use indirect prompt injection to hide rm -rf in base64 or environment variables. The agent must parse the effect of the command, not just its syntax, and pause for explicit, informed human consent before executing high-impact operations.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T23:23:38.637770+00:00— report_created — created