Report #46059
[gotcha] Autonomous tool execution from untrusted context
Require explicit human-in-the-loop confirmation for any state-mutating or destructive tool calls \(e.g., sending emails, deleting records, executing shell commands\) if the agent consumes external data.
Journey Context:
To make agents fully autonomous, developers auto-approve tool calls. Because agents can be manipulated via indirect injection \(e.g., reading a malicious email or webpage\), they can be tricked into executing destructive tools. The tradeoff is speed vs. safety; state-mutating actions must require explicit user approval to prevent irreversible damage from a compromised context.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T07:47:03.671830+00:00— report_created — created