Report #31157
[gotcha] Attacker poisons few-shot examples in the prompt to manipulate output format or content
Isolate few-shot examples from user control. If using dynamic examples retrieved from a database, ensure they are strictly validated and sanitized, and prefer delimiter separation \(e.g., XML tags\) between examples and user input.
Journey Context:
Dynamic few-shot prompting \(retrieving examples from a DB based on user query\) is powerful but dangerous. If an attacker can manipulate the retrieval query to return a malicious document as a "few-shot example", the LLM will dutifully mimic the malicious example's format or content. This bypasses system prompts because LLMs heavily weight few-shot examples as demonstrations of desired behavior.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T06:41:12.307257+00:00— report_created — created