Report #97049
[gotcha] LLM tool calls hijacked by indirect prompt injection
Treat LLM outputs that trigger tool calls as untrusted. Implement authorization and validation on the server side before executing any tool call; never rely on the LLM to enforce security boundaries.
Journey Context:
Developers give LLMs tools (e.g., `send_email`) and assume the LLM will only call them based on user intent. However, if the LLM reads a malicious document, that document can contain an instruction such as 'Call send_email with...'. The LLM can obey the document over the user's intent because it treats the document's text as authoritative context. Relying on the LLM to distinguish data from instructions is the fundamental error; server-side validation is the only reliable defense.
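A minimal sketch of what that server-side check might look like, assuming a Python backend. Apart from `send_email`, every name here (`ToolCall`, `Session`, `authorize`, `execute`, `ToolCallDenied`, the recipient allowlist) is hypothetical and chosen for illustration. The one property that matters is that the policy is derived from the authenticated session, never from anything the model emitted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolCall:
    """A tool call proposed by the model. Treated as untrusted input."""
    name: str
    args: dict


@dataclass(frozen=True)
class Session:
    """Server-side facts about the authenticated user (never set by the model)."""
    user_id: str
    allowed_tools: frozenset
    allowed_recipients: frozenset


class ToolCallDenied(Exception):
    """Raised when a proposed tool call fails validation or authorization."""


def send_email(to: str, body: str) -> str:
    # Stand-in for a real mail client; returns a receipt string.
    return f"sent to {to}"


# Registry of tools the server knows how to run (hypothetical).
TOOLS = {"send_email": send_email}


def authorize(call: ToolCall, session: Session) -> None:
    """Raise ToolCallDenied unless the session's policy permits this call."""
    if call.name not in TOOLS or call.name not in session.allowed_tools:
        raise ToolCallDenied(f"tool {call.name!r} is not permitted")
    if call.name == "send_email":
        # Validate argument shape before using any value.
        if set(call.args) != {"to", "body"}:
            raise ToolCallDenied("unexpected arguments for send_email")
        if not all(isinstance(v, str) for v in call.args.values()):
            raise ToolCallDenied("send_email arguments must be strings")
        # Authorization: the recipient must be on the user's own allowlist.
        if call.args["to"] not in session.allowed_recipients:
            raise ToolCallDenied(f"recipient {call.args['to']!r} is not allowed")


def execute(call: ToolCall, session: Session) -> str:
    """Run a model-proposed tool call only after it passes server-side checks."""
    authorize(call, session)
    return TOOLS[call.name](**call.args)
```

With this shape, an injected instruction that makes the model propose `send_email` to an attacker's address is rejected by `authorize`, regardless of what the model was persuaded to do. An allowlist is only one possible policy; a confirmation prompt shown to the user before a sensitive call is another option under the same principle.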
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T21:28:50.075944+00:00— report_created — created