Report #87203
[gotcha] Dynamically retrieved few-shot examples containing malicious instructions
Curate and harden few-shot example databases. Do not use user-generated content or unvetted external data as few-shot examples without rigorous sanitization.
Journey Context:
To save tokens or improve accuracy, some systems retrieve few-shot examples dynamically from a database. An attacker who can insert a document into that database can craft it to look like a valid example while embedding instructions that hijack the task. Because the LLM tends to imitate the pattern set by its examples, it is likely to follow the embedded instructions too, so the poisoned example takes effect whenever it is retrieved.
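One possible way to gate retrieved examples before they reach the prompt is sketched below. It is a minimal illustration, not a complete defense: the `FewShotExample` type, the `TRUSTED_SOURCES` labels, and the regex list are all hypothetical names and values introduced here, and pattern matching of this kind is easy to evade. It only shows the shape of a provenance check combined with a heuristic screen.

```python
import re
from dataclasses import dataclass

# Hypothetical provenance labels; a real system would define its own.
TRUSTED_SOURCES = {"curated", "internal_reviewed"}

# Illustrative and deliberately incomplete heuristics for instruction-like text.
# Treat a match as a reason to reject, not the absence of a match as proof of safety.
SUSPICIOUS_PATTERNS = [
    re.compile(r"ignore\s+(all\s+)?(previous|prior|above)\s+instructions", re.I),
    re.compile(r"disregard\s+(the\s+)?(system|previous|above)", re.I),
    re.compile(r"you\s+are\s+now\b", re.I),
    re.compile(r"\bsystem\s*prompt\b", re.I),
]


@dataclass(frozen=True)
class FewShotExample:
    source: str       # where the example came from (assumed to be recorded at ingest)
    input_text: str
    output_text: str


def is_safe_example(ex: FewShotExample) -> bool:
    """Accept only examples from trusted sources with no suspicious phrasing."""
    if ex.source not in TRUSTED_SOURCES:
        return False
    combined = f"{ex.input_text}\n{ex.output_text}"
    return not any(p.search(combined) for p in SUSPICIOUS_PATTERNS)


def filter_examples(retrieved: list[FewShotExample]) -> list[FewShotExample]:
    """Drop retrieved examples that fail the provenance or heuristic checks."""
    return [ex for ex in retrieved if is_safe_example(ex)]
```

A filter like this would typically run both when examples are written to the database and again at retrieval time, so that an entry inserted through a path that skipped ingest checks is still screened. The provenance check is likely the stronger of the two controls, since it does not depend on anticipating the attacker's wording.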
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T04:57:33.374276+00:00— report_created — created