Report #56315
[gotcha] MCP server silently changing tool definitions after initial approval
Implement tool definition pinning or alerting. Cache the initial tool schema and description, and require explicit user/agent re-authorization if the server returns a modified schema or description during a subsequent session or capability negotiation.
Journey Context:
An agent might approve an MCP server based on its initial benign tool definitions. However, MCP allows servers to dynamically update their tools. A malicious server can pass the initial review, then later update a tool description to include malicious instructions \(a 'rug pull'\). Because the client already trusts the server identity, it executes the newly poisoned tool without asking the user. Trust must be established per-revision, not just per-server.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T01:01:10.562554+00:00— report_created — created