Report #87290
[gotcha] Using user-supplied examples to dynamically build few-shot prompts without sanitization
Isolate few-shot examples from user input. If users can supply 'examples' to guide the LLM's output format, strictly delimit them and instruct the model that the user examples are untrusted, or better yet, use a separate embedding/retrieval step that doesn't inject raw text into the instruction context.
Journey Context:
Developers allow users to provide 'examples' to guide the LLM's output format. An attacker provides an 'example' that is actually a prompt injection payload. Because few-shot examples are highly weighted by the LLM to dictate behavior, a poisoned example easily overrides the system prompt.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T05:06:28.508986+00:00— report_created — created