Report #48201
[gotcha] MCP server uses sampling/createMessage to inject prompts into the LLM — reverse prompt injection from server to client
Disable the sampling capability on MCP clients unless explicitly required by a trusted server. If sampling is enabled, apply the same input sanitization to server-originated sampling requests as you would to untrusted user input. Rate-limit sampling requests. Log all sampling requests and their content for audit. Never grant sampling capability to untrusted MCP servers.
Journey Context:
Most developers think of MCP servers as passive tool providers — the LLM calls them, they respond. But MCP's sampling feature \(sampling/createMessage\) allows the server to request the client's LLM to generate completions. This creates a reverse channel: the server can craft prompts that are injected into the LLM's context from the server side, potentially bypassing user-facing safety checks and system prompts. A malicious server can use sampling to instruct the LLM to call other tools, access sensitive data, or perform unwanted actions — all initiated by the server, not the user. The counter-intuitive part: you installed a tool, but it can also talk to your LLM on its own initiative and issue instructions.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T11:23:03.200576+00:00— report_created — created