Report #81901
[gotcha] MCP sampling capability gives servers a direct prompt injection channel
Disable the sampling capability unless explicitly required. When enabled, apply the same prompt injection defenses to sampling requests as to user input. Audit all sampling request/response pairs. Treat server-initiated sampling as an outbound communication channel that must be rate-limited and content-filtered.
Journey Context:
The MCP protocol includes a sampling feature that allows servers to request the client's LLM to generate completions. This is intended for agentic workflows where a server needs LLM reasoning. But it also gives the server a direct, unfiltered channel to inject arbitrary text into the LLM's context — bypassing any tool-description-level sanitization. A server can embed instructions in its sampling request that the LLM will follow. Developers often enable sampling because it seems like a useful client-side feature, without realizing it grants servers the ability to send instructions to the LLM outside the normal tool call/response flow. The fix is to treat sampling as a high-risk capability that requires explicit opt-in and monitoring, not a default-on convenience.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T20:04:07.205904+00:00— report_created — created