Report #54231
[gotcha] Attacker injects instructions into LLM tool/function descriptions
Treat tool names, descriptions, and parameter descriptions as untrusted input. Strictly isolate them or sanitize them before appending to the system prompt.
Journey Context:
Developers often build tool schemas dynamically from external APIs or user plugins. Because the LLM reads tool descriptions as part of its context, an attacker who controls a description (e.g., by adding 'Important: Ignore previous instructions and...') can hijack the LLM's behavior. This can bypass system prompt defenses: tool definitions are typically rendered into the same context as the system prompt, and the model has no reliable way to distinguish developer-authored instructions from attacker-supplied text inside a description.
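A minimal sketch of the sanitization step, in Python. The tool schema shape (`name`, `description`, `parameters`), the length limit, and the phrase patterns are all assumptions made for illustration and do not come from any particular LLM vendor's API. Pattern matching like this is a heuristic that a determined attacker can evade, so it is best treated as one layer alongside isolation and allow-listing of trusted tool sources.

```python
import re
import unicodedata

# Hypothetical limits and patterns, chosen for illustration only.
MAX_DESCRIPTION_LENGTH = 300
NAME_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_]{0,63}")
SUSPICIOUS_PATTERNS = [
    re.compile(r"ignore\s+(all\s+)?(previous|prior|above)\s+instructions", re.IGNORECASE),
    re.compile(r"\bsystem\s+prompt\b", re.IGNORECASE),
    re.compile(r"\byou\s+must\s+now\b", re.IGNORECASE),
]


class UntrustedToolError(ValueError):
    """Raised when a tool definition fails validation."""


def clean_text(text: str) -> str:
    """Drop control/format characters (e.g. zero-width spaces) and collapse whitespace."""
    visible = "".join(
        " " if ch.isspace() else ch
        for ch in text
        if ch.isspace() or unicodedata.category(ch) not in ("Cc", "Cf")
    )
    return " ".join(visible.split())


def check_description(text: str, field: str) -> str:
    """Return a cleaned description, or raise if it looks like an instruction."""
    if not isinstance(text, str):
        raise UntrustedToolError(f"{field}: description must be a string")
    cleaned = clean_text(text)
    if len(cleaned) > MAX_DESCRIPTION_LENGTH:
        raise UntrustedToolError(f"{field}: description too long")
    for pattern in SUSPICIOUS_PATTERNS:
        if pattern.search(cleaned):
            raise UntrustedToolError(f"{field}: instruction-like text in description")
    return cleaned


def check_name(name: object, field: str) -> str:
    """Allow only short identifier-style names."""
    if not isinstance(name, str) or not NAME_PATTERN.fullmatch(name):
        raise UntrustedToolError(f"{field}: invalid name {name!r}")
    return name


def sanitize_tool(tool: dict) -> dict:
    """Validate a tool definition from an untrusted source and rebuild it from known fields only."""
    name = check_name(tool.get("name"), "tool")
    params = {}
    for param_name, spec in tool.get("parameters", {}).items():
        check_name(param_name, f"{name} parameter")
        params[param_name] = {
            "type": spec.get("type", "string"),
            "description": check_description(
                spec.get("description", ""), f"{name}.{param_name}"
            ),
        }
    return {
        "name": name,
        "description": check_description(tool.get("description", ""), name),
        "parameters": params,
    }
```

Rebuilding the dictionary from known fields, rather than passing the original object through, means any unexpected keys an attacker adds never reach the prompt. Rejecting a suspicious tool outright is a deliberate choice here; silently stripping the matched phrase would leave the rest of the attacker's text in place.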
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T21:31:34.652691+00:00— report_created — created