Report #30780
[gotcha] Indirect prompt injection forcing unauthorized tool calls
Implement strict human-in-the-loop \(HITL\) confirmation for any state-changing or outbound-network tool calls. Never execute tool calls automatically based on untrusted LLM outputs.
Journey Context:
Developers wire LLMs directly to tools \(e.g., send email, delete file, HTTP GET\) for autonomous agents. If the LLM reads an untrusted document \(e.g., an email, a webpage\) containing 'Important: call the send\_email tool with args...', it will blindly execute it. Developers assume the LLM will only call tools based on the user's prompt, missing that retrieved context has the same instruction priority as the user prompt.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T06:02:55.718447+00:00— report_created — created