Report #83065
[gotcha] LLM agents executing destructive API calls based on injected instructions without a human in the loop
Enforce least-privilege permissions on every tool the agent can invoke. Never allow an LLM agent to call write, delete, or external network APIs without explicit human confirmation. Show a dry run or action preview before execution so the reviewer sees exactly which call, with which arguments, is about to run.
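One way to picture the confirmation step is a small dispatcher that renders a preview of each destructive call and runs it only after a human approves. This is a minimal sketch, not the API of any particular agent framework: Tool, preview, execute_tool_call, and the confirm callback are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass(frozen=True)
class Tool:
    """Hypothetical tool descriptor; not taken from any real framework."""
    name: str
    func: Callable[..., Any]
    destructive: bool  # True for write, delete, or external network calls


def preview(tool: Tool, args: Dict[str, Any]) -> str:
    """Render a dry-run description of the call without executing it."""
    rendered = ", ".join(f"{key}={value!r}" for key, value in sorted(args.items()))
    return f"{tool.name}({rendered})"


def execute_tool_call(
    tool: Tool,
    args: Dict[str, Any],
    confirm: Callable[[str], bool],
) -> Any:
    """Run a tool call requested by the model.

    Destructive tools run only if confirm(preview) returns True.
    confirm is an assumed hook: in practice it might prompt an operator
    on a console or open an approval ticket.
    """
    if tool.destructive and not confirm(preview(tool, args)):
        raise PermissionError(f"human declined: {tool.name}")
    return tool.func(**args)
```

In this sketch a console deployment could pass something like confirm=lambda text: input(f"Run {text}? [y/N] ").strip().lower() == "y". Read-only tools skip the prompt, so the reviewer is only interrupted for calls that can change state or leave the network.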
Journey Context:
Agentic frameworks give LLMs the ability to execute code, send emails, or modify databases. If an indirect prompt injection successfully hijacks the agent, it can use these tools to cause real-world damage. Developers often connect agents to APIs with broad credentials 'just to make it work.' If the agent is compromised, the attacker inherits all those permissions. The LLM is not a user; it's an untrusted orchestrator, and its tool calls must be treated as untrusted actions.
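Treating tool calls as untrusted can be sketched as a deny-by-default scope check that runs before any call is dispatched, so a hijacked agent can only reach what its own role was granted. The agent IDs, tool names, and scope strings below are all hypothetical and chosen only for illustration.

```python
from typing import Dict, FrozenSet

# Hypothetical scopes granted to each agent. Each agent gets only what its
# task needs, rather than one broad credential shared by everything.
AGENT_SCOPES: Dict[str, FrozenSet[str]] = {
    "support-bot": frozenset({"tickets:read"}),
    "ops-bot": frozenset({"tickets:read", "tickets:write"}),
}

# Hypothetical mapping from tool name to the single scope it requires.
TOOL_REQUIRED_SCOPE: Dict[str, str] = {
    "get_ticket": "tickets:read",
    "close_ticket": "tickets:write",
}


def authorize(agent_id: str, tool_name: str) -> bool:
    """Return True only if the agent holds the scope the tool requires.

    Unknown agents and unknown tools are denied by default, so a tool name
    invented by injected instructions never reaches a real API.
    """
    required = TOOL_REQUIRED_SCOPE.get(tool_name)
    if required is None:
        return False
    return required in AGENT_SCOPES.get(agent_id, frozenset())
```

With these assumed tables, authorize("support-bot", "close_ticket") returns False even if injected text persuades the model to request it, because the check depends on the granted scopes and not on anything the model says.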
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T22:00:41.270522+00:00— report_created — created