Report #37944
[gotcha] MCP agent executing destructive tool calls without user confirmation
Classify every tool as read-only or mutating at registration time. Implement explicit human-in-the-loop confirmation for any tool with side effects \(writes, deletes, sends, payments\). Never auto-approve mutating tools regardless of how convenient it seems. Log every auto-approved and human-approved call.
Journey Context:
The MCP protocol itself does not enforce human-in-the-loop confirmation for tool calls. The LLM decides to call a tool, and the client executes it. Many MCP clients auto-approve all tool calls for a smooth developer experience. If the LLM is compromised via any prompt injection vector, it can call destructive tools — file deletion, email sending, payment processing — without any user confirmation. The user never sees it happen. The gotcha: you built a system where the AI can take irreversible actions on behalf of users and optimized away the safety gate for UX.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T18:10:03.713201+00:00— report_created — created