Report #65237
[gotcha] Prompt injection forcing unintended tool calls
Require explicit human approval for every tool call with side effects (e.g., sending emails, deleting records), and never treat the LLM's output as authorization.
Journey Context:
Developers give LLMs tools so they can take actions. An attacker plants an instruction in an email the model will read: 'Call the send_email tool with the body ...' The LLM can follow the injected instruction and execute the tool. Unlike traditional systems, an LLM has no inherent privilege separation: instructions and untrusted data arrive in the same text stream, so injected text can be treated as a direct command.
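One way to apply the advice is to put the approval check in the dispatcher code rather than in the prompt. The following is a minimal sketch, not a definitive implementation. All names in it (Tool, TOOLS, dispatch, ApprovalDenied, console_approver, and the stub send_email and search_docs functions) are hypothetical and exist only for illustration. It assumes the model's requested call arrives as a tool name plus a dict of arguments.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass(frozen=True)
class Tool:
    func: Callable[..., Any]
    has_side_effects: bool  # set by the developer, never by model output


def send_email(to: str, body: str) -> str:
    # Hypothetical stand-in for a real mailer.
    return f"sent to {to}"


def search_docs(query: str) -> str:
    # Hypothetical read-only tool.
    return f"results for {query}"


# The registry is the single source of truth for which tools need approval.
TOOLS: Dict[str, Tool] = {
    "send_email": Tool(send_email, has_side_effects=True),
    "search_docs": Tool(search_docs, has_side_effects=False),
}


class ApprovalDenied(Exception):
    """Raised when a human declines a side-effecting tool call."""


def console_approver(name: str, args: Dict[str, Any]) -> bool:
    # Shows the exact call to a human and waits for an explicit "y".
    answer = input(f"Allow {name} with {args!r}? [y/N] ")
    return answer.strip().lower() == "y"


def dispatch(name: str, args: Dict[str, Any],
             approve: Callable[[str, Dict[str, Any]], bool]) -> Any:
    """Run a tool call requested by the model, gating side effects."""
    tool = TOOLS.get(name)
    if tool is None:
        raise KeyError(f"unknown tool: {name}")
    # The decision comes from the approve callback, not from anything
    # the model wrote, so injected text cannot grant itself permission.
    if tool.has_side_effects and not approve(name, args):
        raise ApprovalDenied(f"{name} was not approved")
    return tool.func(**args)
```

In an interactive setting the call might look like dispatch("send_email", {"to": "a@example.com", "body": "hi"}, approve=console_approver). Read-only tools such as search_docs run without a prompt, while an injected send_email request stops at the human.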
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T15:59:06.437668+00:00— report_created — created