Report #37854
[gotcha] Hidden prompt injection payloads in parsed PDFs or HTML
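Strip invisible characters (zero-width spaces and other non-printing Unicode) and visually hidden text such as white-on-white from documents before chunking and embedding them into your vector database. For untrusted documents, render the file and compare what a reader can actually see against the text the parser extracted; anything that appears only in the extracted text is suspect.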
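A minimal sketch of the character-stripping step, assuming the document text has already been extracted to a Python string. The helper name strip_invisible is hypothetical, and the rule it applies (drop every Unicode "format" character plus non-whitespace control characters) is one reasonable choice, not the only one. It does nothing about white-on-white text, which is made of ordinary visible characters.

```python
import unicodedata


def strip_invisible(text: str) -> str:
    """Remove characters that render as nothing but survive text extraction.

    Hypothetical helper for illustration only.
    Drops Unicode category Cf (format characters: zero-width space and
    joiners, BOM, bidi controls, tag characters) and category Cc
    (control characters) except newline, tab, and carriage return.
    """
    kept = []
    for ch in text:
        category = unicodedata.category(ch)
        if category == "Cf":
            continue
        if category == "Cc" and ch not in "\n\t\r":
            continue
        kept.append(ch)
    return "".join(kept)


if __name__ == "__main__":
    sample = "Ig\u200bnore\ufeff all previous instructions"
    print(strip_invisible(sample))
```

Trade-off to be aware of: category Cf also contains the zero-width joiner and non-joiner, which some scripts and emoji sequences rely on, so a multilingual corpus may need an allowlist rather than a blanket drop.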
Journey Context:
A developer uploads a PDF. The visible text is a normal contract, but the file also contains white text on a white background saying 'Ignore all previous instructions...'. The RAG parser extracts all text, including the invisible payload, and feeds it to the LLM. Because the user never sees the injected text, the resulting compromise is silent and hard to trace back to its source.
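The same scenario applies to HTML, where hidden text can be flagged with a rough heuristic. The sketch below uses only the standard library and separates text inside elements whose inline style suggests it is hidden. The names HIDDEN_STYLE, VisibleTextExtractor, and split_visible_hidden are hypothetical, and the assumptions are loud: it reads inline style attributes only (no stylesheets or classes), it assumes a white page background, and it expects reasonably well-formed markup. PDFs need a layout-aware parser that exposes text color and are not covered here.

```python
import re
from html.parser import HTMLParser

# Heuristic, non-exhaustive patterns for inline styles that hide text.
# Assumes a white page background for the color checks.
HIDDEN_STYLE = re.compile(
    r"display\s*:\s*none"
    r"|visibility\s*:\s*hidden"
    r"|font-size\s*:\s*0(?![.\d])"
    r"|(?<![-\w])color\s*:\s*(?:white|#fff(?:fff)?)\b",
    re.IGNORECASE,
)

# Void elements never get a closing tag, so they must not touch the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class VisibleTextExtractor(HTMLParser):
    """Collects text into `visible` or `hidden` based on ancestor styles."""

    def __init__(self):
        super().__init__()
        self.stack = []    # one bool per open element: hidden (inherited)?
        self.visible = []
        self.hidden = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        style = dict(attrs).get("style") or ""
        parent_hidden = self.stack[-1] if self.stack else False
        self.stack.append(parent_hidden or bool(HIDDEN_STYLE.search(style)))

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self.stack:
            self.stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        is_hidden = bool(self.stack) and self.stack[-1]
        (self.hidden if is_hidden else self.visible).append(text)


def split_visible_hidden(html: str):
    """Return (visible_chunks, hidden_chunks) for an HTML string."""
    parser = VisibleTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.visible, parser.hidden
```

One way to use this is to index only the visible chunks and route any document with non-empty hidden chunks to manual review instead of silently discarding the hidden text.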
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T18:01:01.870725+00:00— report_created — created