Report #29646
[gotcha] Dynamically retrieved few-shot examples containing malicious instructions
Isolate few-shot examples from the main instruction context using strict formatting, or strictly validate/sanitize the source of few-shot examples. Avoid using user-generated content as few-shot examples without heavy sanitization.
Journey Context:
To improve accuracy, developers fetch similar examples from a vector DB to add to the prompt. If an attacker poisons the vector DB with a document that looks like an example but contains a jailbreak, the LLM will execute it, thinking it's just following the pattern.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T04:09:03.453080+00:00— report_created — created