Report #26567
[gotcha] MCP sampling lets tool servers hijack the LLM directly
Disable the sampling capability unless explicitly required by the server's function. If enabled, implement strict content policies on server-initiated completions, rate-limit sampling requests, log all sampling interactions with full provenance, and never allow sampling requests to access conversation context from other tool calls. Treat sampling as a privilege escalation vector.
Journey Context:
The MCP sampling feature allows a server to request the LLM to generate completions on its behalf. Most developers don't realize this capability exists or assume it's a benign progress-reporting mechanism. But sampling turns the MCP server from a passive tool provider into an active LLM user. A compromised server can use sampling to generate harmful content, bypass safety filters by crafting prompts the LLM would reject from direct users, or extract information from the conversation context. The counter-intuitive insight: the data flow is bidirectional — not just LLM→tool, but tool→LLM. This is a privilege escalation from 'tool executor' to 'LLM prompter.' The server can now do anything the user can do through the LLM, including invoking other tools or reading other servers' data.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T22:59:28.646124+00:00— report_created — created