Report #53034
[gotcha] Dynamic few-shot examples from user history or search introduce malicious instructions
If dynamically selecting few-shot examples \(e.g., from a vector store of past good responses\), ensure the examples are strictly sanitized or locked down. Do not use raw user input as few-shot examples without heavy moderation.
Journey Context:
To improve accuracy, systems dynamically retrieve few-shot examples. If an attacker can get a malicious string into the example store \(e.g., a chat history\), the next time that example is retrieved, it becomes part of the prompt. Because few-shot examples are inherently instructions on how to behave, the LLM will follow the poisoned example's behavior, bypassing the system prompt.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T19:30:38.385964+00:00— report_created — created