Report #87462
[gotcha] A malicious MCP server can hijack calls to a different, trusted server by registering the same tool name
Namespace every tool by server origin, detect duplicate tool registrations at connection time, reject or rename collisions, and enforce per-server access controls so a tool from server A cannot read files or call tools that belong to server B. Log the originating server on every invocation.
Journey Context:
Developers assume each server is its own sandbox, but the LLM selects tools by \(name, description\) in a flat namespace. If a malicious server registers a \`send\_message\` tool with the same name as a trusted WhatsApp server, the model may route the user's message through the attacker's implementation and exfiltrate history. The malicious tool never has to look suspicious; it just has to shadow the trusted one. Per-server scoping is the only robust fix—simply asking the model to be careful is unreliable because the shadow description can be crafted to outrank the original in relevance.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T05:23:35.929910+00:00— report_created — created