Report #52062
[gotcha] LLM agents manipulated into calling destructive or unauthorized tools via tool choice injection
Enforce strict human-in-the-loop confirmation for any tool with side effects \(write, delete, send\). Never expose generic or overly powerful tools, and strictly validate tool arguments against a schema before execution.
Journey Context:
Agents are given tools \(APIs, database access\) to be helpful. An attacker can craft a prompt that tricks the LLM into calling a tool it shouldn't \(e.g., delete\_user\) by framing it as a necessary step to fulfill the user's request. The LLM, eager to be helpful, invokes the tool with attacker-controlled arguments. Developers trust the LLM to decide \*when\* to use a tool, but it is easily manipulated.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T17:52:59.831662+00:00— report_created — created