Report #24869
[gotcha] Attacker poisoning few-shot examples in the system prompt
Do not dynamically include user-generated content or unvetted external data as few-shot examples in the system prompt; use static, trusted examples or strictly sanitize dynamic ones.
Journey Context:
Developers sometimes fetch 'successful examples' from a database to use as few-shot prompts. If an attacker can write to that data source (e.g., by submitting content to a review system), they can plant a malicious example. Models tend to imitate few-shot examples closely because the examples demonstrate the desired behavior, so a poisoned example can override the main system instructions.
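A minimal sketch of the recommended pattern, assuming a sentiment-classification prompt. All names here (`TRUSTED_EXAMPLES`, `Candidate`, `is_safe`, `build_system_prompt`) are hypothetical and not from any library. Static examples are always included; database-sourced examples are admitted only if they were marked as reviewed and pass strict checks. The regex is an illustrative heuristic, not a complete injection filter.

```python
import json
import re
from dataclasses import dataclass

# Hypothetical static examples: human-reviewed and shipped with the code.
TRUSTED_EXAMPLES = [
    ("Great battery life, arrived on time.", "positive"),
    ("Stopped working after two days.", "negative"),
]

ALLOWED_LABELS = {"positive", "negative", "neutral"}
MAX_LEN = 200

# Illustrative heuristic only: pattern matching cannot catch every injection.
SUSPICIOUS = re.compile(
    r"ignore (all|any|previous|prior)|disregard|system prompt|you are now",
    re.IGNORECASE,
)


@dataclass(frozen=True)
class Candidate:
    text: str
    label: str
    reviewed: bool  # assumed to be set by a moderation step, never by the submitter


def is_safe(c: Candidate) -> bool:
    """Admit a dynamic example only if it passes every check."""
    return (
        c.reviewed
        and c.label in ALLOWED_LABELS
        and len(c.text) <= MAX_LEN
        and "\n" not in c.text
        and SUSPICIOUS.search(c.text) is None
    )


def build_system_prompt(candidates: list[Candidate]) -> str:
    """Combine static examples with vetted dynamic ones."""
    examples = list(TRUSTED_EXAMPLES)
    examples += [(c.text, c.label) for c in candidates if is_safe(c)]
    lines = [
        "Classify the sentiment of each review as positive, negative, or neutral.",
        "Examples:",
    ]
    for text, label in examples:
        # json.dumps quotes and escapes the text so it reads as data.
        lines.append(f"Review: {json.dumps(text)}")
        lines.append(f"Label: {label}")
    return "\n".join(lines)
```

Dropping the dynamic examples entirely and relying on `TRUSTED_EXAMPLES` alone is the safer default; the filter only narrows the attack surface for cases where dynamic examples are genuinely needed.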
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T20:08:49.925562+00:00— report_created — created