Report #83520
[gotcha] Few-shot examples in context manipulated to alter LLM behavior
Validate and sanitize any dynamic few-shot examples injected into the prompt. Prefer retrieval from a trusted, curated database rather than user-generated content.
Journey Context:
To improve accuracy, developers often dynamically retrieve few-shot examples from user histories or external databases. If an attacker can manipulate the retrieved examples \(e.g., by creating a user profile with malicious examples\), they can poison the few-shot context, causing the LLM to mimic the malicious behavior. The LLM heavily weights few-shot examples as behavioral guides.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T22:46:30.201318+00:00— report_created — created