Report #38615
[gotcha] Indirect prompt injection hijacking LLM tool/function calls
Treat LLM-generated function calls as untrusted user input; enforce strict schema validation, parameter allow-lists, and human-in-the-loop approval for any state-mutating or destructive function execution.
Journey Context:
Developers assume the LLM will only call functions relevant to the user's intent. However, if the LLM reads a malicious document (indirect injection), the document can instruct the LLM to output a specific JSON function call (e.g., `send_email(to="[email protected]")`). The backend blindly executes this JSON. The LLM is just a text generator; its JSON output is attacker-controlled if the input is untrusted.
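A minimal sketch of the gating described above, assuming the model emits a JSON object with `name` and `arguments` keys. The tool names, the `POLICIES` table, the `ALLOWED_EMAIL_DOMAINS` allow-list, and the `approve` callback are all hypothetical and stand in for whatever your backend actually uses; this is one possible shape, not a reference implementation.

```python
import json
from dataclasses import dataclass
from typing import Any, Callable, Dict, Tuple


class ToolCallRejected(Exception):
    """Raised when a model-proposed tool call fails validation or approval."""


@dataclass(frozen=True)
class ToolPolicy:
    params: Dict[str, type]   # exact parameter names and their expected types
    mutating: bool = False    # True -> requires human approval before running


# Hypothetical allow-list of tools the model may request at all.
POLICIES: Dict[str, ToolPolicy] = {
    "search_docs": ToolPolicy(params={"query": str}),
    "send_email": ToolPolicy(params={"to": str, "body": str}, mutating=True),
}

# Hypothetical parameter allow-list: only these recipient domains are permitted.
ALLOWED_EMAIL_DOMAINS = frozenset({"example.com"})


def validate_tool_call(raw: str) -> Tuple[str, Dict[str, Any]]:
    """Parse and validate model output exactly like untrusted user input."""
    try:
        call = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ToolCallRejected(f"not valid JSON: {exc}") from exc
    if not isinstance(call, dict):
        raise ToolCallRejected("tool call must be a JSON object")

    name, args = call.get("name"), call.get("arguments")
    if not isinstance(name, str) or name not in POLICIES:
        raise ToolCallRejected(f"unknown tool: {name!r}")
    if not isinstance(args, dict):
        raise ToolCallRejected("arguments must be a JSON object")

    policy = POLICIES[name]
    # Strict schema: no missing and no extra parameters.
    if set(args) != set(policy.params):
        raise ToolCallRejected("unexpected or missing parameters")
    for key, expected in policy.params.items():
        if not isinstance(args[key], expected):
            raise ToolCallRejected(f"parameter {key!r} has the wrong type")

    # Parameter allow-list for the sensitive field.
    if name == "send_email":
        domain = args["to"].rpartition("@")[2].lower()
        if domain not in ALLOWED_EMAIL_DOMAINS:
            raise ToolCallRejected(f"recipient domain not allowed: {domain!r}")
    return name, args


def execute_tool_call(
    raw: str,
    handlers: Dict[str, Callable[..., Any]],
    approve: Callable[[str, Dict[str, Any]], bool],
) -> Any:
    """Validate, ask a human for mutating calls, then dispatch."""
    name, args = validate_tool_call(raw)
    if POLICIES[name].mutating and not approve(name, args):
        raise ToolCallRejected("human reviewer declined the call")
    return handlers[name](**args)
```

In this sketch, a `send_email` call injected by a document fails on the recipient allow-list before any handler runs, and even a well-formed `send_email` call is held until `approve` returns `True`. Read-only tools such as `search_docs` skip the approval step but still pass through schema validation.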
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T19:17:21.056932+00:00— report_created — created