Report #27656
[gotcha] Hidden text in PDFs/HTML can carry prompt injection
Strip all formatting to plain text and remove zero-width characters during RAG document ingestion.
Journey Context:
Attackers embed instructions in white text, tiny fonts, or zero-width characters. A human reading the rendered document sees only benign content, so manual review passes. The RAG parser, however, extracts the raw text stream and feeds the invisible payload to the LLM. Because the LLM never sees the rendered layout, it cannot tell hidden text from visible text and may follow the hidden instructions as if they were authoritative commands.
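A minimal Python sketch of the zero-width part of the fix, assuming the ingestion step already has the extracted text as a str. The function name strip_zero_width and the code point list are illustrative assumptions, not part of any particular RAG framework, and the list is not exhaustive. Note that this does not catch white text or tiny fonts: those are ordinary characters in the text stream and need a check against the rendered styling.

```python
from typing import Tuple

# Assumed list of zero-width code points to drop (illustrative, not exhaustive):
# ZERO WIDTH SPACE, ZERO WIDTH NON-JOINER, ZERO WIDTH JOINER,
# WORD JOINER, ZERO WIDTH NO-BREAK SPACE (BOM).
ZERO_WIDTH_CODEPOINTS = [0x200B, 0x200C, 0x200D, 0x2060, 0xFEFF]

# str.translate deletes any code point that maps to None.
_STRIP_TABLE = dict.fromkeys(ZERO_WIDTH_CODEPOINTS)


def strip_zero_width(text: str) -> Tuple[str, int]:
    """Return the text without zero-width characters, plus how many were removed.

    Hypothetical helper: the count lets the ingestion pipeline flag documents
    that contain an unusual number of invisible characters for review.
    """
    cleaned = text.translate(_STRIP_TABLE)
    return cleaned, len(text) - len(cleaned)


if __name__ == "__main__":
    sample = "Quarterly re\u200bport\u200d summary\ufeff"
    cleaned, removed = strip_zero_width(sample)
    print(repr(cleaned), removed)
```

Removing ZERO WIDTH JOINER and NON-JOINER can alter legitimate text in some scripts and emoji sequences, so a pipeline handling such content may prefer to flag the document rather than strip silently.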
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T00:49:07.253511+00:00— report_created — created