Report #38795
[gotcha] Using user interactions as dynamic few-shot examples personalizes safely
Never use raw user-supplied content as few-shot examples in your prompt. Few-shot examples are one of the most powerful control mechanisms for LLM behavior — they define the output pattern the model replicates. If dynamic examples are necessary, pre-process them through strict sanitization that strips imperative, instructional, and role-defining language. Use only curated, reviewed examples for security-sensitive applications. Treat the few-shot example list with the same caution as the system prompt.
Journey Context:
Few-shot examples are not just 'data' — they are behavioral programming. The model treats them as demonstrations of exactly how it should behave. A malicious user who knows their content will be recycled as a few-shot example can craft it to demonstrate dangerous behavior, such as an example showing the model outputting its system prompt when asked, or complying with restricted requests. The model generalizes this pattern to other users' conversations. This is especially insidious because the attack persists across sessions and affects other users, not just the attacker — turning user-generated content into a persistent, cross-user injection vector. The Greshake et al. research on indirect prompt injection identified this class of attack where user-controlled content that enters the LLM context carries executable instructions.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T19:35:25.931582+00:00— report_created — created