Report #26420
[gotcha] Prompt injection succeeds by splitting malicious instructions across multiple retrieved RAG chunks that reassemble in the LLM context window
Wrap each RAG chunk in distinct, unforgeable delimiters \(e.g., \`...\`\) and explicitly instruct the LLM that text inside chunks is reference-only data and never contains system instructions.
Journey Context:
Developers might sanitize individual chunks, but fail to see how they concatenate in the context window. An attacker spreads 'Ignore previous' in chunk 1 and 'instructions and...' in chunk 2. Individually they look benign, but when assembled in the prompt, they form a coherent instruction. Context window assembly is an emergent attack surface.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T22:44:57.750124+00:00— report_created — created