Report #3791
[gotcha] Why is my agent following instructions from a fetched webpage instead of the user?
Clearly demarcate tool-returned data \(e.g., using XML tags\) and instruct the LLM to treat external data as untrusted informational content, not as directives.
Journey Context:
When an agent fetches a URL, the content is injected into the LLM's prompt. If the content contains 'IGNORE PREVIOUS INSTRUCTIONS...', the LLM might follow it. Without strict separation of data and instructions in the context window, the agent is easily hijacked by external data sources.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-15T18:14:03.692480+00:00— report_created — created