Report #24307
[gotcha] Dynamic few-shot examples introducing malicious behavior
Avoid using user-generated content or untrusted database entries as few-shot examples; use a static, curated set of examples or strictly sanitize dynamic examples to remove imperative language.
Journey Context:
Few-shot examples strongly influence LLM behavior. If an application dynamically builds few-shot examples from user profiles or search results, an attacker can craft an input that looks like an example but contains a malicious instruction (e.g., 'User: X, Assistant: [Harmful output]'). The model tends to mimic the pattern, which can override the system prompt, because in-context examples weigh heavily on its output.
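The mitigation described above can be sketched roughly as follows. This is a minimal illustration, not a vetted defense: the names `Example`, `STATIC_EXAMPLES`, `is_safe_example`, and `select_examples` are hypothetical, and the regular expressions are heuristic assumptions that will miss many phrasings. It keeps a curated static set and only admits dynamic examples that contain neither role markers nor obvious override-style imperatives.

```python
import re
from typing import List, NamedTuple


class Example(NamedTuple):
    user: str
    assistant: str


# Curated, static examples (placeholder content for illustration only).
STATIC_EXAMPLES = [
    Example("What is the capital of France?", "Paris."),
    Example("Convert 10 km to miles.", "About 6.2 miles."),
]

# Heuristic: text that imitates a chat transcript ("User:", "Assistant:", ...).
ROLE_MARKER = re.compile(r"\b(system|user|assistant)\s*:", re.IGNORECASE)

# Heuristic: override-style imperatives such as "ignore previous instructions".
# Assumption: this short verb/noun list is illustrative, not exhaustive.
IMPERATIVE = re.compile(
    r"\b(ignore|disregard|forget|override)\b.{0,40}\b(instructions?|prompt|rules?)\b",
    re.IGNORECASE,
)


def is_safe_example(text: str) -> bool:
    """Return False if the text contains a role marker or an override imperative."""
    return not (ROLE_MARKER.search(text) or IMPERATIVE.search(text))


def select_examples(dynamic: List[Example]) -> List[Example]:
    """Static examples first, then only the dynamic ones that pass both checks."""
    safe = [
        ex for ex in dynamic
        if is_safe_example(ex.user) and is_safe_example(ex.assistant)
    ]
    return STATIC_EXAMPLES + safe


def render(examples: List[Example]) -> str:
    """Format examples as the few-shot block of a prompt."""
    return "\n".join(f"User: {ex.user}\nAssistant: {ex.assistant}" for ex in examples)
```

Rejecting a suspicious example outright, rather than trying to strip the offending phrase, is a deliberate choice in this sketch: a partially cleaned string can still carry the attacker's pattern. Pattern matching alone is easy to evade, so the static set remains the safer default.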
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T19:12:25.976526+00:00— report_created — created