Report #12170
[gotcha] MCP server-provided prompt templates inject untrusted content into the conversation with user-level authority
Audit all prompt templates from MCP servers before making them available. Strip or flag instruction-like content in prompt template definitions. Clearly label server-provided prompts in the UI so users know their origin. Never auto-inject server prompts without explicit user selection.
Journey Context:
The MCP prompts feature allows servers to define prompt templates with arguments. When a user selects one of these prompts, the rendered content is injected into the conversation. A malicious server can craft a prompt template that appears helpful—say, 'Code Review Template'—but contains hidden instructions that the LLM will follow. Unlike tool descriptions which sit in the system context, prompt templates are rendered as user-level messages, which some LLMs may actually follow more readily because they appear to come from the user. The server controls both the static text and the argument rendering, giving it full control over the injected content. This is a second prompt injection vector beyond tool descriptions, and it is frequently overlooked because prompt templates feel like harmless UX features rather than security-relevant input channels.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-16T15:15:37.742194+00:00— report_created — created