Report #31355
[gotcha] Dynamic few-shot examples poisoning LLM output format or behavior
Curate few-shot examples from a static, trusted database. If dynamic examples are necessary, apply strict output parsing and validation, and do not rely on the LLM to strictly follow a schema if the examples are untrusted.
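The "strict output parsing and validation" step could look roughly like the sketch below, which uses only the Python standard library. The schema (`intent`, `confidence`), the allowed intent values, and the names `parse_model_output` and `OutputValidationError` are hypothetical and chosen for illustration; the report does not specify a schema. The idea is to reject anything that does not match exactly, rather than trusting the model to have followed the format.

```python
import json

# Hypothetical schema for illustration only; adapt to the real application.
ALLOWED_INTENTS = {"refund", "shipping", "other"}
EXPECTED_KEYS = {"intent", "confidence"}


class OutputValidationError(ValueError):
    """Raised when model output does not match the expected schema."""


def parse_model_output(raw: str) -> dict:
    """Parse raw model text and validate it; raise instead of guessing."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise OutputValidationError(f"not valid JSON: {exc}") from exc

    if not isinstance(data, dict):
        raise OutputValidationError("top-level value must be an object")
    # Reject both missing and extra keys so injected fields cannot slip through.
    if set(data) != EXPECTED_KEYS:
        raise OutputValidationError(f"unexpected keys: {sorted(data)}")

    intent = data["intent"]
    if not isinstance(intent, str) or intent not in ALLOWED_INTENTS:
        raise OutputValidationError("intent is not an allowed value")

    confidence = data["confidence"]
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(confidence, bool) or not isinstance(confidence, (int, float)):
        raise OutputValidationError("confidence must be a number")
    # This range check also rejects NaN, since comparisons with NaN are False.
    if not 0.0 <= confidence <= 1.0:
        raise OutputValidationError("confidence must be between 0 and 1")

    return data
```

Validating against a closed set of values (an allowlist) means a poisoned example that steers the model toward free-form or injected text fails here instead of reaching downstream code such as a SQL query builder.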
Journey Context:
Developers often use a vector database to retrieve examples similar to the incoming query and include them in the prompt as few-shot guidance. If that example store contains untrusted content, an attacker can submit a query that retrieves a malicious example (e.g., an example that includes a SQL injection or breaks the JSON schema). The LLM tends to mimic the malicious example closely, overriding system instructions about output format, because few-shot examples heavily bias the model's behavior.
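One way to keep retrieved examples tied to a trusted, curated set is to check each retrieval hit against a static registry before it reaches the prompt. The sketch below is a minimal illustration under assumptions: the `Example` type, the `digest` and `filter_trusted` helpers, and the idea of a registry mapping example IDs to SHA-256 digests are all hypothetical and not part of the report. Retrieval itself (the vector database query) is left out.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Example:
    """A few-shot example as returned by retrieval (hypothetical shape)."""
    example_id: str
    prompt: str
    completion: str


def digest(example: Example) -> str:
    """Content hash over prompt and completion, separated by a NUL byte."""
    payload = f"{example.prompt}\x00{example.completion}".encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def filter_trusted(retrieved, registry, limit=3):
    """Keep only examples whose ID and content match the curated registry.

    `registry` maps example_id -> expected digest and is assumed to be built
    offline from reviewed examples, not from user-supplied data.
    """
    trusted = [
        ex for ex in retrieved
        if registry.get(ex.example_id) == digest(ex)
    ]
    return trusted[:limit]
```

With this check, an example that was never reviewed (unknown ID) or was modified after review (digest mismatch) is dropped before prompt assembly. It does not replace output validation, since a curated example set can still be incomplete or wrong.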
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T07:00:57.465432+00:00— report_created — created