Report #40584
[gotcha] LLM agent modifying or deleting data based on injected instructions
Apply the principle of least privilege to tool implementations. Require explicit human-in-the-loop confirmation for any state-changing operation (DELETE, UPDATE, WRITE) triggered by the LLM agent.
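A minimal sketch of a confirmation gate for state-changing operations, under stated assumptions. The names ToolCall, execute_tool_call, ConfirmationDenied, and the confirm callback are hypothetical and do not come from any particular agent framework. The gate sits between the agent and the real tool, so the model cannot skip it.

```python
from dataclasses import dataclass
from typing import Any, Callable

# Assumption: the operations treated as state-changing; extend for your tools.
STATE_CHANGING = frozenset({"DELETE", "UPDATE", "WRITE"})


class ConfirmationDenied(Exception):
    """Raised when a human declines a state-changing tool call."""


@dataclass(frozen=True)
class ToolCall:
    operation: str  # e.g. "READ" or "DELETE"
    target: str     # e.g. a record identifier


def execute_tool_call(
    call: ToolCall,
    run: Callable[[ToolCall], Any],
    confirm: Callable[[ToolCall], bool],
) -> Any:
    """Run a tool call, asking a human first if it would change state."""
    if call.operation.upper() in STATE_CHANGING:
        # The decision comes from the human callback, not from model output.
        if not confirm(call):
            raise ConfirmationDenied(
                f"{call.operation} on {call.target} was not approved"
            )
    return run(call)
```

In an interactive setting, confirm could wrap a prompt shown to the operator; in a service, it could enqueue the call for review and return False until it is approved.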
Journey Context:
Developers often give LLM agents database or API tools with full CRUD permissions. An indirect prompt injection (instructions hidden in content the agent reads, such as a web page, email, or document) can then cause the agent to delete records or send spam email. The LLM acts as a 'confused deputy': it issues a well-formed tool call, but the intent behind that call comes from untrusted input rather than from the user. Better system prompts do not fix this. The fix is to restrict each tool's capabilities and to enforce authorization in the tool layer.
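One way to enforce authorization in the tool layer is to register only the tools the current session is entitled to, so an injected instruction has nothing destructive to call. The sketch below is illustrative only: Session, RecordStore, build_tools, and dispatch are hypothetical names, and the in-memory store stands in for a real database or API.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


class NotAuthorized(Exception):
    """Raised when the agent requests a tool the session was never granted."""


@dataclass(frozen=True)
class Session:
    # Permissions come from the authenticated user, never from model output.
    user_id: str
    allowed: frozenset


class RecordStore:
    """Toy in-memory store standing in for a database or API."""

    def __init__(self, records: Dict[str, str]) -> None:
        self._records = dict(records)

    def read(self, record_id: str) -> str:
        return self._records[record_id]

    def delete(self, record_id: str) -> None:
        del self._records[record_id]

    def count(self) -> int:
        return len(self._records)


def build_tools(session: Session, store: RecordStore) -> Dict[str, Callable[..., Any]]:
    """Register only the tools this session is permitted to call."""
    tools: Dict[str, Callable[..., Any]] = {}
    if "read" in session.allowed:
        tools["read_record"] = store.read
    if "delete" in session.allowed:
        tools["delete_record"] = store.delete
    return tools


def dispatch(tools: Dict[str, Callable[..., Any]], name: str, **kwargs: Any) -> Any:
    """Execute a tool requested by the agent, rejecting unregistered names."""
    if name not in tools:
        raise NotAuthorized(f"tool {name!r} is not available in this session")
    return tools[name](**kwargs)
```

With a read-only session, a call to delete_record raises NotAuthorized no matter what text the model was fed, because the check lives outside the model.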
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T22:35:38.829562+00:00— report_created — created