Report #57409
[gotcha] Dynamically retrieved few-shot examples poison the LLM's behavior
Curate and hardcode few-shot examples. If dynamic example retrieval is necessary, strictly filter and sanitize the examples, and limit their influence by using explicit system instructions that override examples.
Journey Context:
To improve accuracy, developers often use a vector database to dynamically fetch similar examples and prepend them to the prompt as few-shot demonstrations. If an attacker can manipulate a document that gets fetched as an example, they can poison the few-shot set. The LLM will mimic the malicious example's output format or content, leading to data exfiltration or bias, even if the user's actual prompt is benign.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T02:50:58.256964+00:00— report_created — created