Report #46258
[gotcha] Why are MCP server-provided prompt templates causing my agent to behave unexpectedly?
Audit all prompt templates from MCP servers before making them available to users or the LLM. Treat server-provided prompts as untrusted input. Display the full prompt text to the user before injection. Sanitize or review prompt content for hidden instructions.
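A minimal sketch of what such an audit might look like, assuming the prompt has already been fetched and is held as a plain dict shaped like a prompts/get result (a `messages` list whose items carry `role` and a `content` object with `type` and `text`). The function names and the pattern list are hypothetical and far from exhaustive; a heuristic scan like this can flag obvious cases but does not replace human review.

```python
import re
import unicodedata

# Hypothetical heuristics for illustration only; tune or replace for real use.
SUSPICIOUS_PATTERNS = [
    r"ignore (all |any )?(previous|prior|above) instructions",
    r"do not (tell|show|mention|reveal)",
    r"https?://",
]


def extract_text(prompt_result):
    """Concatenate every text part of a prompts/get-style result dict."""
    parts = []
    for message in prompt_result.get("messages", []):
        content = message.get("content", {})
        if content.get("type") == "text":
            parts.append(content.get("text", ""))
    return "\n".join(parts)


def audit_prompt(prompt_result):
    """Return a list of human-readable findings; an empty list means nothing was flagged."""
    text = extract_text(prompt_result)
    findings = []

    # Unicode format characters (category Cf), such as zero-width spaces,
    # are invisible in most UIs and can be used to hide text from a reviewer.
    hidden = sorted({f"U+{ord(ch):04X}" for ch in text if unicodedata.category(ch) == "Cf"})
    if hidden:
        findings.append("invisible characters: " + ", ".join(hidden))

    for pattern in SUSPICIOUS_PATTERNS:
        if re.search(pattern, text, flags=re.IGNORECASE):
            findings.append("matches pattern: " + pattern)

    return findings
```

An empty result only means none of these particular heuristics fired, so the prompt text should still be shown to a person before it is used.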
Journey Context:
MCP servers can define prompt templates (via prompts/list and prompts/get) that are presented to users as pre-built conversation starters or workflows. These prompts are injected directly into the LLM context. A malicious server can define a prompt that appears helpful ('Code Review Assistant') but contains hidden instructions that exfiltrate data or bypass safety measures. The gotcha is that prompt templates feel like UI features, curated starting points, but they are arbitrary text that the LLM processes as instructions with full authority. Once a user selects a server-provided prompt, it is indistinguishable from the user's own input in the LLM's context window.
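One way to make the injection step visible is to gate it behind an explicit preview, sketched below under the same assumption that the prompt is a plain dict shaped like a prompts/get result. `render_for_review`, `approve_and_inject`, and the `confirm` callback are hypothetical names, and `context` stands in for whatever message list the client sends to the LLM.

```python
def render_for_review(server_name, prompt_name, prompt_result):
    """Build the exact text a user should read before approving injection."""
    lines = [f"Prompt '{prompt_name}' from server '{server_name}' (untrusted):"]
    for message in prompt_result.get("messages", []):
        role = message.get("role", "?")
        content = message.get("content", {})
        if content.get("type") == "text":
            # Replace non-printable characters with a visible marker so that
            # hidden text cannot slip past the reviewer.
            visible = "".join(
                ch if ch.isprintable() or ch == "\n" else f"<U+{ord(ch):04X}>"
                for ch in content.get("text", "")
            )
            lines.append(f"[{role}] {visible}")
        else:
            lines.append(f"[{role}] <non-text content: {content.get('type')}>")
    return "\n".join(lines)


def approve_and_inject(context, server_name, prompt_name, prompt_result, confirm):
    """Append the prompt's messages to `context` only if `confirm(preview)` returns True."""
    preview = render_for_review(server_name, prompt_name, prompt_result)
    if not confirm(preview):
        return False
    context.extend(prompt_result.get("messages", []))
    return True
```

In an interactive client, `confirm` could print the preview and ask for a yes/no answer; the point of the design is that nothing from the server reaches the context until the full text has been displayed.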
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T08:07:07.075364+00:00 - report_created - created