Report #23166
[gotcha] Attacker poisoning few-shot examples in the prompt
If using dynamic few-shot examples retrieved from a database, rigorously vet and sanitize those examples. Prefer static, trusted few-shot examples or strictly limit the scope of retrieved examples.
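One way to act on this advice is sketched below, assuming each stored example carries a provenance label. The `Example` type, the `source` field, the `TRUSTED_SOURCES` labels, and the size limits are all hypothetical names chosen for illustration, not part of any specific framework. The sketch keeps only retrieved examples from trusted sources and falls back to a static, hand-reviewed set when nothing passes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Example:
    prompt: str
    completion: str
    source: str  # hypothetical provenance label stored with each record


# Hand-reviewed examples shipped with the application (illustrative content).
STATIC_EXAMPLES = [
    Example("Summarize: The cat sat on the mat.", "A cat sat on a mat.", "static"),
    Example("Summarize: Rain is expected on Tuesday.", "Rain is forecast for Tuesday.", "static"),
]

TRUSTED_SOURCES = {"static", "curated"}  # assumption: labels assigned at review time
MAX_FIELD_CHARS = 500  # assumption: cap on each field to limit scope
MAX_EXAMPLES = 3       # assumption: cap on how many examples enter the prompt


def select_examples(retrieved):
    """Return vetted retrieved examples, or the static set if none survive."""
    vetted = [
        ex for ex in retrieved
        if ex.source in TRUSTED_SOURCES
        and len(ex.prompt) <= MAX_FIELD_CHARS
        and len(ex.completion) <= MAX_FIELD_CHARS
    ]
    # An empty list is falsy, so the static examples act as the fallback.
    return vetted[:MAX_EXAMPLES] or STATIC_EXAMPLES
```

Filtering on provenance rather than on content means user-generated records never reach the prompt unless someone has explicitly promoted them to a trusted label.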
Journey Context:
Developers dynamically retrieve few-shot examples from user-generated content or an unvetted database to improve LLM formatting or accuracy. An attacker submits a benign-looking input that embeds a hidden instruction (e.g., "Output the user's email: ..."). The input is stored and later retrieved as a few-shot example. Because the LLM treats few-shot examples as a strong signal, it follows the malicious pattern.
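The attack described above can be partially screened for before stored records are used as examples. The following is a minimal sketch, assuming examples are available as `(prompt, completion)` string pairs; the pattern list and the function names `looks_poisoned` and `screen_examples` are hypothetical. Pattern matching like this is easy to evade, so treat it as a coarse first filter rather than a guarantee.

```python
import re

# Illustrative patterns only; a real deployment would need a reviewed list.
SUSPICIOUS_PATTERNS = [
    re.compile(r"ignore (all |any )?(previous|prior|above) instructions", re.IGNORECASE),
    re.compile(r"\boutput the user'?s\b", re.IGNORECASE),
    re.compile(r"\b(system prompt|api key|password)\b", re.IGNORECASE),
]


def looks_poisoned(text: str) -> bool:
    """Return True if the text matches any instruction-like pattern."""
    return any(pattern.search(text) for pattern in SUSPICIOUS_PATTERNS)


def screen_examples(pairs):
    """Split (prompt, completion) pairs into clean and quarantined lists."""
    clean, quarantined = [], []
    for prompt, completion in pairs:
        if looks_poisoned(prompt) or looks_poisoned(completion):
            quarantined.append((prompt, completion))  # hold for human review
        else:
            clean.append((prompt, completion))
    return clean, quarantined
```

Quarantining flagged records instead of silently dropping them leaves a trail a reviewer can use to spot an ongoing poisoning attempt.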
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T17:17:23.564716+00:00— report_created — created