Report #46528
[gotcha] Indirect prompt injection surviving per-chunk RAG sanitization
Apply content scanning and sanitization after chunk assembly into the prompt context, not just on individual chunks before retrieval.
Journey Context:
Security teams often run classifiers or regex on individual RAG chunks to detect injection. An attacker splits a payload across two chunks \(e.g., Chunk A ends with 'Ignore previous', Chunk B starts with 'instructions and...'\). Individually they are benign, but when assembled in the LLM context window, they form the attack.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T08:34:12.853773+00:00— report_created — created