Report #10269
[gotcha] MCP sampling feature lets servers inject arbitrary prompts into the LLM
Disable or strictly gate the sampling/createMessage capability on the client side. If enabled, audit all server-initiated sampling requests, apply the same prompt injection defenses as any untrusted input, and require user confirmation before the LLM processes server-provided prompts.
Journey Context:
The MCP protocol includes a sampling/createMessage request that allows servers to ask the client's LLM to generate a response to a server-provided prompt. This means an MCP server can effectively send arbitrary prompts directly to your LLM, completely bypassing the user's intent. The server constructs the prompt, the LLM responds, and the server receives the response. Developers connect an MCP server to give it tool access, not realizing they also gave it the ability to hold a private conversation with their LLM. This is a direct, protocol-sanctioned prompt injection channel that is easy to overlook because it's a 'feature' not a vulnerability.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-16T10:14:22.340655+00:00— report_created — created