Report #65926
[gotcha] Can an MCP server make my LLM generate text or take actions without user approval?
Require explicit user approval for every sampling request from an MCP server. Enforce a hard limit on recursive sampling depth (e.g., at most 3 nested sampling requests per tool invocation). Log every sampling request with full context for audit. If your use case does not require sampling, consider disabling it entirely; many clients never need servers to request LLM completions.
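The three controls above (per-request approval, a depth cap, and an audit log) can be combined in one client-side gate. This is a minimal sketch, not an MCP SDK API: `SamplingGate`, `SamplingDenied`, the `approve` callback, and the `complete` callable are all hypothetical names, and wiring the gate into your client's handler for server sampling requests is left to you.

```python
import json
import time
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


class SamplingDenied(Exception):
    """Raised when a sampling request is refused by policy or by the user."""


@dataclass
class SamplingGate:
    # Hypothetical callback: (server_name, prompt) -> True if the user approves.
    approve: Callable[[str, str], bool]
    max_depth: int = 3          # nested sampling requests allowed per tool invocation
    enabled: bool = True        # set to False to refuse all sampling
    audit_log: List[str] = field(default_factory=list)
    _depth: int = 0

    def start_tool_invocation(self) -> None:
        """Reset the depth counter; call once per user-initiated tool call."""
        self._depth = 0

    def _decide(self, server: str, prompt: str) -> Tuple[bool, str]:
        if not self.enabled:
            return False, "sampling disabled"
        if self._depth > self.max_depth:
            return False, "depth limit exceeded"
        # Ask the user last, so requests already blocked by policy never prompt.
        if not self.approve(server, prompt):
            return False, "user declined"
        return True, "approved"

    def handle(self, server: str, prompt: str, complete: Callable[[str], str]) -> str:
        """Gate one sampling request; `complete` is whatever calls the LLM."""
        self._depth += 1
        allowed, reason = self._decide(server, prompt)
        # Record every request, including refused ones, with its context.
        self.audit_log.append(json.dumps({
            "ts": time.time(),
            "server": server,
            "depth": self._depth,
            "prompt": prompt,
            "decision": reason,
        }))
        if not allowed:
            raise SamplingDenied(reason)
        return complete(prompt)
```

A client would call `start_tool_invocation()` whenever the user triggers a tool, then route each server sampling request through `handle()`. Keeping approval as a callback makes the policy explicit: passing an always-true callback is exactly the auto-approve configuration this report warns against.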
Journey Context:
MCP's sampling feature lets servers request LLM completions, which inverts the usual client-server control flow. The client becomes a proxy that forwards server requests to the LLM. A malicious server can use this to create a recursive loop: a tool is invoked, the server requests sampling, the LLM generates instructions, and the server uses those instructions to invoke more tools or request more sampling. This bypasses normal user-in-the-loop oversight when sampling requests are auto-approved or approved in bulk. The counter-intuitive part is that the server, not the user, is driving the LLM. The tradeoff is that sampling enables powerful agentic workflows (e.g., a server that needs LLM reasoning to process data). The right call is strict approval gating and depth limits.
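The loop described here can be illustrated with a toy simulation. It involves no real MCP traffic: `hostile_server`, `fake_llm`, and `drive_loop` are hypothetical stand-ins, every request is treated as auto-approved, and `hard_stop` exists only so the unbounded case terminates.

```python
from typing import Optional


def fake_llm(prompt: str) -> str:
    """Stand-in for the client's LLM call (hypothetical)."""
    return f"instructions derived from: {prompt[:30]}"


def hostile_server(completion: str) -> Optional[str]:
    """A server that answers every completion with another sampling request."""
    return f"follow-up request using: {completion[:30]}"


def drive_loop(max_depth: Optional[int], hard_stop: int = 1000) -> int:
    """Count the completions one tool invocation yields for the server.

    max_depth=None models auto-approval with no depth limit.
    """
    completions = 0
    pending: Optional[str] = "initial sampling request"
    while pending is not None and completions < hard_stop:
        if max_depth is not None and completions >= max_depth:
            break  # client refuses further nested sampling
        completion = fake_llm(pending)
        completions += 1
        # The server, not the user, decides whether the loop continues.
        pending = hostile_server(completion)
    return completions


unbounded = drive_loop(max_depth=None)  # runs until the simulation's hard_stop
bounded = drive_loop(max_depth=3)       # stops after 3 completions
```

In the unbounded run nothing on the client side ever ends the loop; only the artificial `hard_stop` does. With `max_depth=3` the client ends it after three completions regardless of what the server asks for next.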
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T17:08:20.135912+00:00— report_created — created