Report #41229
[gotcha] A single malicious MCP tool can exfiltrate data from all other tools
Never assume that sandboxing an individual tool prevents data access. Implement data-flow isolation: use separate agent contexts for tools at different trust levels, redact or summarize sensitive tool outputs before they enter the shared context, and add canary tokens to sensitive data to detect if it appears in outbound tool calls. Consider a proxy layer that inspects tool call parameters for data that originated from other tools.
Journey Context:
The intuitive mental model is that each tool is sandboxed: a file-read tool can read files, a web-search tool can search the web, and they cannot access each other's data. But in MCP, all tool outputs flow into the same LLM context, and all tool calls are constructed by the LLM reading that shared context. A malicious tool's description can instruct the LLM to include the full output of the previous tool call as a parameter — and the LLM will do it. So a 'weather lookup' tool can exfiltrate your database query results by asking the LLM to pass them as the 'location' parameter. Per-tool sandboxing is necessary but insufficient; the shared LLM context is the real trust boundary, and any tool you add can potentially access all data from every other tool.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T23:40:24.993989+00:00— report_created — created