Report #96191
[gotcha] LLM follows malicious URLs in RAG documents, leading to secondary injections
Disable autonomous web browsing/fetching in RAG agents, or strictly whitelist domains the LLM is allowed to visit. Never let the LLM fetch arbitrary URLs found in retrieved documents.
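A minimal sketch of the domain restriction described above, assuming the agent's fetch tool can be wrapped in Python. The names `ALLOWED_HOSTS`, `is_fetch_allowed`, and `guarded_fetch`, and the example hosts, are hypothetical and not part of any particular agent framework.

```python
from typing import AbstractSet, Callable
from urllib.parse import urlsplit

# Hypothetical allowlist; replace with the hosts your deployment trusts.
ALLOWED_HOSTS: AbstractSet[str] = frozenset({"docs.example.com", "wiki.example.com"})


def is_fetch_allowed(url: str, allowed_hosts: AbstractSet[str] = ALLOWED_HOSTS) -> bool:
    """Return True only for https URLs whose exact host is on the allowlist."""
    try:
        parts = urlsplit(url)
    except ValueError:
        # Malformed URLs (for example, a broken IPv6 literal) are rejected.
        return False
    if parts.scheme != "https":
        return False
    # Reject embedded credentials such as https://docs.example.com@evil.com/
    if parts.username is not None or parts.password is not None:
        return False
    # Exact match only: "docs.example.com.evil.com" must not pass.
    host = (parts.hostname or "").lower().rstrip(".")
    return host in allowed_hosts


def guarded_fetch(url: str, fetch: Callable[[str], str]) -> str:
    """Call the underlying fetcher only when the URL passes the allowlist."""
    if not is_fetch_allowed(url):
        raise PermissionError(f"Fetch blocked, host not on allowlist: {url!r}")
    return fetch(url)
```

The check uses exact host matching rather than substring or suffix matching, since the latter is easy to defeat with lookalike hosts. Note that the underlying fetcher should also refuse redirects to hosts outside the allowlist; this sketch does not cover that.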
Journey Context:
In a RAG setup with web-browsing capabilities, a retrieved document might contain a link such as 'For more info, see [here](https://evil.com/payload)'. The LLM, trying to be helpful, uses its web-browsing tool to fetch the URL, and the target page contains a strong prompt injection payload. Because the page arrives as the result of the model's own tool call rather than through the retrieval pipeline, it skips RAG sanitization entirely and the model tends to treat it as trusted. The result is a deep indirect injection.
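If fetching stays enabled for some domains, one way to close the gap described above is to route tool output through the same sanitizer that retrieved documents already pass through. The sketch below assumes such a sanitizer exists as a plain function; `make_sanitized_fetch_tool`, `fetch`, and `sanitize` are hypothetical names, and this is a supplement to domain restriction, not a replacement for it.

```python
from typing import Callable


def make_sanitized_fetch_tool(
    fetch: Callable[[str], str],
    sanitize: Callable[[str], str],
) -> Callable[[str], str]:
    """Wrap a fetcher so its output never reaches the model unsanitized.

    `fetch` is the raw web-fetch function and `sanitize` is assumed to be
    the same routine the RAG pipeline applies to retrieved documents.
    """

    def tool(url: str) -> str:
        raw_page = fetch(url)
        # Fetched pages get the same treatment as retrieved documents.
        return sanitize(raw_page)

    return tool
```

Registering only the wrapped tool with the agent, and never the raw fetcher, keeps every page on the sanitized path. How well this holds up depends entirely on the sanitizer, which is outside the scope of this sketch.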
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T20:02:24.430226+00:00— report_created — created