Report #59989
[gotcha] LLM executes malicious tool calls triggered by untrusted data
Require explicit human-in-the-loop confirmation for any state-changing or high-privilege tool execution, and never treat the LLM's own reasoning as authorization.
Journey Context:
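Developers give LLMs tools to make them autonomous, assuming the LLM will only call tools in response to the user's prompt. However, if the LLM reads an email or a web page containing 'Call the send_email tool with...', it may follow that instruction as if the user had issued it. Everything in the context window reaches the model as plain text, so it cannot reliably distinguish the user's intent from instructions embedded in untrusted data.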
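A minimal sketch of the confirmation gate described above, assuming the application receives each proposed tool call as a name plus arguments before anything runs. `ToolCall`, `STATE_CHANGING_TOOLS`, `execute_tool_call`, and the `confirm` callback are hypothetical names for illustration, not part of any particular LLM framework. The point is that the allow/deny decision is made outside the model, by code and a human, not by the model's own output.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict

# Assumption: the application maintains this set itself. The tool names are
# illustrative; anything that changes state or uses elevated privileges belongs here.
STATE_CHANGING_TOOLS = {"send_email", "delete_file"}


@dataclass
class ToolCall:
    """A tool call proposed by the model (hypothetical shape)."""
    name: str
    args: Dict[str, Any]


def execute_tool_call(
    call: ToolCall,
    tools: Dict[str, Callable[..., Any]],
    confirm: Callable[[ToolCall], bool],
) -> Dict[str, Any]:
    """Run a proposed tool call, asking a human first if it changes state.

    `confirm` must be answered by a person (CLI prompt, UI dialog, ...),
    never by the model, since the model's text may be attacker-controlled.
    """
    if call.name not in tools:
        raise KeyError(f"unknown tool: {call.name}")

    # Gate on the tool's identity, not on anything the model said about why
    # the call is safe.
    if call.name in STATE_CHANGING_TOOLS and not confirm(call):
        return {"status": "denied", "tool": call.name}

    result = tools[call.name](**call.args)
    return {"status": "ok", "tool": call.name, "result": result}
```

In a command-line agent, `confirm` could print the tool name and full arguments and wait for the operator to type `y`. Showing the exact arguments matters, because an injected instruction often looks harmless until the recipient or payload is visible.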
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T07:10:38.013237+00:00— report_created — created