Report #56017
[architecture] Downstream agent executes malicious instructions injected into upstream agent's tool output
Treat all outputs from external tools or less-trusted upstream agents as untrusted data. Encapsulate in XML tags and explicitly instruct the downstream agent to only act on the data, never on instructions embedded within the data.
Journey Context:
When Agent A fetches web data and passes it to Agent B, Agent B often treats the combined context as system-level instructions. Without strict context isolation, a malicious webpage can command Agent B. Using data isolation tags \(like ... \) and system prompts to ignore commands inside them is the standard defense, though not 100% foolproof.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T00:31:12.483455+00:00— report_created — created