Report #45898
[gotcha] Each MCP server I add seems safe individually — why is my agent suddenly over-privileged?
Audit the COMBINED capability set of all connected MCP servers, not each one individually. Define explicit privilege boundaries per agent session. Build a tool capability matrix that maps what combinations of tools can achieve together. Regularly review and remove unnecessary MCP servers. Implement least-privilege at the session level.
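One way to express a per-session privilege boundary is an allowlist that is checked before any server is connected. The sketch below is only illustrative: the capability labels (`fs.read`, `net.send`, `proc.exec`), the server names, and the `SessionPolicy` and `filter_servers` names are all hypothetical and are not part of MCP or any SDK. It assumes you have already labeled each server with the capabilities its tools expose.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class SessionPolicy:
    """Hypothetical privilege boundary: the capabilities one session may hold."""
    name: str
    allowed: frozenset[str]


def filter_servers(
    policy: SessionPolicy, servers: dict[str, set[str]]
) -> tuple[list[str], dict[str, set[str]]]:
    """Split servers into those inside the boundary and those exceeding it.

    `servers` maps a server name to the capability labels you assigned to it.
    Returns (accepted server names, {rejected server: capabilities over the limit}).
    """
    accepted: list[str] = []
    rejected: dict[str, set[str]] = {}
    for server, caps in sorted(servers.items()):
        excess = caps - policy.allowed
        if excess:
            rejected[server] = excess
        else:
            accepted.append(server)
    return accepted, rejected


# Example: a test-running session that needs file reads and command execution,
# but has no reason to make outbound network requests.
policy = SessionPolicy("run-tests", frozenset({"fs.read", "proc.exec"}))
servers = {
    "files": {"fs.read"},
    "http": {"net.send"},
    "tests": {"proc.exec"},
}
accepted, rejected = filter_servers(policy, servers)
```

Here `accepted` contains the servers that fit the boundary and `rejected` records which capability pushed each remaining server over it, which gives a concrete reason to leave that server out of the session.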
Journey Context:
Tool capabilities are multiplicative, not additive. Taken separately, a 'read files' tool and a 'send HTTP requests' tool are two reasonable capabilities. Together they can exfiltrate any file the agent can read. Add a 'write files' tool and you have arbitrary file modification. Add a 'run shell commands' tool and you have full system compromise. Each addition seems reasonable in isolation ('I just need to run tests'), but the combined capability set grows far beyond what any individual server intended. This is especially insidious because MCP servers are designed to be composable: the spec encourages connecting multiple servers. No single server is malicious, but the emergent capability of the combination is dangerous, and no single component is responsible for auditing it.
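The combination effect described above can be checked mechanically with a small capability matrix. This is a minimal sketch under stated assumptions: the server names, the capability labels, and the risk rules are all hypothetical and hand-assigned, and nothing here queries a real MCP server. The point is only that the audit runs over the union of capabilities, not over each server in turn.

```python
from __future__ import annotations

# Hypothetical, hand-assigned capability labels for each connected server.
SERVER_CAPS: dict[str, set[str]] = {
    "filesystem": {"fs.read"},
    "fetch": {"net.send"},
    "editor": {"fs.write"},
    "shell": {"proc.exec"},
}

# Illustrative rules: when every capability in the set is present at once,
# the named outcome becomes possible.
RISKY_COMBINATIONS: list[tuple[frozenset[str], str]] = [
    (frozenset({"fs.read", "net.send"}), "file exfiltration"),
    (frozenset({"fs.read", "fs.write"}), "arbitrary file modification"),
    (frozenset({"proc.exec"}), "full system compromise"),
]


def audit(connected: list[str]) -> list[str]:
    """Return the risks enabled by the COMBINED capabilities of `connected`."""
    combined: set[str] = set()
    for server in connected:
        combined |= SERVER_CAPS[server]
    # A rule fires when its required capabilities are a subset of the union.
    return [risk for needed, risk in RISKY_COMBINATIONS if needed <= combined]


# Each server alone trips no rule, but the pair does.
alone_a = audit(["filesystem"])
alone_b = audit(["fetch"])
together = audit(["filesystem", "fetch"])
```

With these rules, `alone_a` and `alone_b` are both empty while `together` reports file exfiltration, which mirrors the read-plus-HTTP case. A real matrix would need rules reviewed by whoever owns the session's threat model; the three above simply restate the examples in this report.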
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T07:30:50.709887+00:00— report_created — created