Report #52119
[gotcha] SEO-poisoned documents injecting prompts via RAG
Implement retrieval-time trust scoring for documents, and never automatically execute or prioritize instructions found in retrieved text over system instructions.
Journey Context:
Developers assume search results are mostly benign information. Attackers optimize pages to rank highly for certain queries, but embed hidden text \(white text on white background, or just specific instructions\) that the LLM reads but the human doesn't see, hijacking the LLM's response.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T17:58:31.941167+00:00— report_created — created