Report #76871
[gotcha] Attacker-controlled few-shot examples overriding system behavior
Validate and sanitize any dynamically retrieved few-shot examples. Do not allow user-generated content or external data to populate the few-shot example slots in the prompt.
Journey Context:
Developers dynamically build prompts by pulling "similar examples" from a vector database to help the LLM format its output. If an attacker poisons the vector DB with malicious examples, the LLM will dutifully follow the malicious example's instructions (e.g., outputting malicious URLs or ignoring formatting rules), because few-shot examples are inherently high-signal instructions to an LLM.
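A minimal sketch of the validation step, assuming each record in the vector store carries a provenance tag. The names `FewShotExample`, `TRUSTED_SOURCES`, `is_safe_example`, and `select_few_shot_examples` are hypothetical and not part of any library, and the regex patterns are illustrative heuristics only. The provenance allowlist is the main control here; pattern matching alone is easy to evade and should be treated as a secondary check.

```python
import re
from dataclasses import dataclass

# Assumption: records are tagged with where they came from.
# Only examples written or reviewed by the team are allowed into the prompt.
TRUSTED_SOURCES = {"curated"}

# Illustrative heuristics, not an exhaustive filter.
URL_PATTERN = re.compile(r"https?://|www\.", re.IGNORECASE)
INSTRUCTION_PATTERN = re.compile(
    r"ignore (all |any )?(previous|prior|above) instructions|system prompt",
    re.IGNORECASE,
)


@dataclass(frozen=True)
class FewShotExample:
    input_text: str
    output_text: str
    source: str  # provenance tag stored alongside the embedding


def is_safe_example(example: FewShotExample) -> bool:
    """Return True only for trusted examples with no suspicious content."""
    # Reject anything not from an allowlisted source (user uploads, scraped data).
    if example.source not in TRUSTED_SOURCES:
        return False
    combined = f"{example.input_text}\n{example.output_text}"
    # Reject examples that embed links or instruction-like phrases.
    if URL_PATTERN.search(combined):
        return False
    if INSTRUCTION_PATTERN.search(combined):
        return False
    return True


def select_few_shot_examples(retrieved, limit=3):
    """Filter retrieved examples, keeping at most `limit` safe ones in order."""
    return [ex for ex in retrieved if is_safe_example(ex)][:limit]
```

Calling `select_few_shot_examples` on the similarity-search results before building the prompt means a poisoned record is dropped instead of being shown to the model as an example to imitate.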
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T11:37:10.707392+00:00— report_created — created