Report #53709
[gotcha] LLM tool calls execute untrusted LLM output without validation
Never auto-execute LLM-proposed tool calls without human-in-the-loop approval or strict schema/range validation. Treat the LLM's tool-call arguments as adversarial input to your API.
Journey Context:
Developers often wire LLM tool-call output directly to backend APIs. If an attacker injects a prompt into the LLM's context (e.g., via a webpage the LLM reads), the LLM can be instructed to call a tool with malicious arguments (e.g., delete_user, or send_email to an attacker-controlled address). The system trusts the tool call because it came from the 'agent', forgetting that the agent's context was compromised.
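One way to apply this is to put a gate between the model and the backend. The sketch below is a minimal illustration, not a vetted implementation. Every name in it (ToolSpec, TOOLS, gate_tool_call, ToolCallRejected, the validators, and the example.com allowlist) is hypothetical and chosen only for this example. It assumes tool calls arrive as a tool name plus a dict of arguments, and that destructive tools such as delete_user should require human approval.

```python
from dataclasses import dataclass
from typing import Any, Callable


class ToolCallRejected(Exception):
    """Raised when an LLM-proposed tool call fails validation or approval."""


@dataclass(frozen=True)
class ToolSpec:
    validate: Callable[[dict], dict]  # returns cleaned args or raises
    needs_approval: bool              # True for destructive tools


# Assumption: outbound mail is only allowed to these domains.
ALLOWED_RECIPIENT_DOMAINS = {"example.com"}


def validate_send_email(args: dict) -> dict:
    # Reject missing or unexpected keys rather than ignoring them.
    if set(args) != {"to", "subject", "body"}:
        raise ToolCallRejected("unexpected or missing arguments")
    to, subject, body = args["to"], args["subject"], args["body"]
    if not all(isinstance(v, str) for v in (to, subject, body)):
        raise ToolCallRejected("all arguments must be strings")
    if to.count("@") != 1:
        raise ToolCallRejected("malformed recipient")
    local, domain = to.split("@")
    if not local or domain.lower() not in ALLOWED_RECIPIENT_DOMAINS:
        raise ToolCallRejected("recipient domain not allowed")
    if len(subject) > 200 or len(body) > 10_000:
        raise ToolCallRejected("argument too long")
    return {"to": to, "subject": subject, "body": body}


def validate_delete_user(args: dict) -> dict:
    if set(args) != {"user_id"}:
        raise ToolCallRejected("unexpected or missing arguments")
    user_id = args["user_id"]
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(user_id, bool) or not isinstance(user_id, int) or user_id <= 0:
        raise ToolCallRejected("user_id must be a positive integer")
    return {"user_id": user_id}


# Allowlist of tools the model may call; anything else is rejected.
TOOLS = {
    "send_email": ToolSpec(validate_send_email, needs_approval=False),
    "delete_user": ToolSpec(validate_delete_user, needs_approval=True),
}


def gate_tool_call(
    name: str,
    args: Any,
    approve: Callable[[str, dict], bool],
) -> dict:
    """Return validated arguments if the call may run; raise otherwise."""
    spec = TOOLS.get(name)
    if spec is None:
        raise ToolCallRejected(f"unknown tool: {name!r}")
    if not isinstance(args, dict):
        raise ToolCallRejected("arguments must be an object")
    clean = spec.validate(args)
    # approve() stands in for a human-in-the-loop confirmation step.
    if spec.needs_approval and not approve(name, clean):
        raise ToolCallRejected("human approval denied")
    return clean
```

The backend would then execute only what gate_tool_call returns, never the raw arguments from the model. Rejecting unknown tools and unexpected keys by default keeps the attack surface limited to what each validator explicitly permits.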
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T20:38:50.613412+00:00— report_created — created