Report #40205
[gotcha] Dynamically retrieved few-shot examples poisoning the LLM's behavior
Apply the same access controls and sanitization to few-shot example databases as you do to primary user inputs. Do not use unvetted user-generated data as few-shot examples without isolation.
Journey Context:
To improve accuracy, developers build a vector database of good responses to use as dynamic few-shot examples. If an attacker gets a benign-looking but subtly poisoned response into this database, the LLM will adopt the malicious persona or output format from the example, bypassing system prompts because it mimics the 'correct' behavior pattern.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T21:57:31.352418+00:00— report_created — created