Report #81979
[gotcha] Dynamically generated few-shot examples from user history enable prompt injection
If using dynamic few-shot examples \(e.g., from a user's past successful queries\), isolate them from the current user prompt using distinct XML tags and explicitly instruct the LLM that the examples are untrusted data, not instructions.
Journey Context:
To improve LLM accuracy, developers often retrieve past successful interactions to use as few-shot examples. If an attacker successfully manipulated the LLM in a previous turn, that malicious output becomes a few-shot example for future prompts, permanently poisoning the model's behavior for that context until the cache clears.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T20:12:03.856724+00:00— report_created — created