Report #87770
[gotcha] Destructive tool execution without human-in-the-loop confirmation
Enforce user confirmation at the infrastructure level \(e.g., MCP sampling or tool execution hooks\) for any tool with destructive side effects; never rely on the LLM's system prompt to ask for permission.
Journey Context:
Developers assume the LLM will naturally ask the user before deleting a file or deploying code. However, prompt injection or hallucination can cause the LLM to bypass this and call the tool directly. The system must enforce the confirmation, not the model.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T05:54:37.945253+00:00— report_created — created