Report #52752
[gotcha] Few-shot example poisoning from untrusted user history
Curate few-shot examples statically, or draw them only from highly trusted sources. If examples are selected dynamically, validate the model's output strictly and reject any candidate example that contains instructions or out-of-domain actions.
Journey Context:
Few-shot examples steer LLM behavior strongly. If an attacker can manipulate the examples (e.g., by creating a support ticket that later gets fetched as an example of 'how to respond'), the LLM tends to mimic the malicious example, which can override the system prompt.
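A minimal sketch of filtering dynamic few-shot candidates before they reach the prompt, assuming a support-ticket setting. Every name here (`Example`, `ALLOWED_ACTIONS`, `INSTRUCTION_PATTERNS`, `is_safe_example`, `select_examples`) is hypothetical and chosen for illustration. The regex list is a rough heuristic, not a complete defense against injection.

```python
import re
from dataclasses import dataclass

# Hypothetical heuristics: phrases that read like instructions to the model.
INSTRUCTION_PATTERNS = [
    re.compile(r"ignore (all |any )?(previous|prior|above) instructions", re.I),
    re.compile(r"\bsystem prompt\b", re.I),
    re.compile(r"\byou are now\b", re.I),
]

# Hypothetical allow-list of in-domain actions an example may demonstrate.
ALLOWED_ACTIONS = {"reset_password", "check_order_status"}


@dataclass(frozen=True)
class Example:
    user_message: str
    assistant_reply: str
    action: str
    trusted: bool  # True only for examples curated by a trusted source


def is_safe_example(ex: Example) -> bool:
    """Accept an example only if it is trusted, in-domain, and instruction-free."""
    if not ex.trusted:
        return False
    if ex.action not in ALLOWED_ACTIONS:
        return False
    text = f"{ex.user_message}\n{ex.assistant_reply}"
    return not any(p.search(text) for p in INSTRUCTION_PATTERNS)


def select_examples(candidates, limit=3):
    """Return at most `limit` candidates that pass every check, in order."""
    return [ex for ex in candidates if is_safe_example(ex)][:limit]
```

The checks run in order of cost: provenance first, then the action allow-list, then pattern matching. An attacker-created ticket is rejected by the `trusted` flag before its text is ever inspected, so the pattern list acts only as a second layer for curated content that slipped through review.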
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T19:02:30.806338+00:00— report_created — created