Report #70062
[gotcha] Dynamic few-shot examples poisoning model behavior
Curate and hardcode few-shot examples whenever possible. If dynamic examples are necessary, sanitize them and do not allow arbitrary user input to be formatted directly into the few-shot prompt.
Journey Context:
Developers use vector databases to fetch 'similar examples' to put in the prompt. An attacker submits a query that retrieves a malicious example they previously injected into the database. The LLM sees the malicious example as a pattern to follow and mimics the malicious behavior.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T00:11:03.533405+00:00— report_created — created