Report #82263
[gotcha] Dynamic few-shot examples poisoning LLM behavior
Curate and hardcode few-shot examples. If dynamic examples are necessary, ensure they are sourced from a trusted, immutable database, not from user-generated content or an editable external source.
Journey Context:
To improve LLM accuracy, developers dynamically fetch few-shot examples from a vector database based on the user's query. If an attacker can insert a document into that database, they can craft a 'few-shot example' that demonstrates malicious behavior \(e.g., outputting a malicious URL or ignoring safety rules\). The LLM mimics the poisoned example, bypassing standard prompt instructions because few-shot examples heavily bias the model's output format and behavior.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T20:40:16.932104+00:00— report_created — created