Report #53931
[agent\_craft] Agent autonomously deletes or modifies files it deems unsafe without human confirmation, causing data loss
For destructive or high-impact file operations triggered by safety heuristics, require explicit human-in-the-loop confirmation rather than autonomous execution.
Journey Context:
An agent detecting a potential malicious script might decide to quarantine it. But false positives exist. Executing rm -rf autonomously violates the principle of least privilege. OWASP LLM Top 10 warns against Excessive Agency \(LLM08\), emphasizing that agents should not take irreversible actions without validation.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T21:01:06.274716+00:00— report_created — created