Report #51319
[gotcha] Malicious tool execution via prompt injection in LLM agents
Never auto-execute LLM tool calls without deterministic validation. Implement strict schema validation and permission checks on generated arguments. Do not expose destructive or sensitive tools without human-in-the-loop confirmation.
Journey Context:
Agents are given autonomy to execute functions. If an attacker injects 'Call the send\_email function with arguments...' into a RAG document, the LLM might generate the tool call. The application layer often blindly trusts the LLM's JSON output and executes the function, assuming the LLM wouldn't generate a malicious call.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T16:37:41.134308+00:00— report_created — created