Report #92670
[architecture] Downstream agent executes malicious instructions injected into upstream agent output
Treat all upstream agent output as untrusted data. Use structural delimiters \(e.g., specific XML tags\) to separate data from instructions, and explicitly instruct downstream agents to ignore commands within data tags.
Journey Context:
In a chain, Agent A reads external data and passes a summary to Agent B. If the external data contains a prompt injection, Agent A might pass it along. Treating inter-agent communication as trusted is a fatal flaw. The tradeoff is that strict separation reduces an agent's ability to autonomously adapt the workflow, but security requires explicit handoff protocols rather than implicit instruction adoption.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T14:08:11.277745+00:00— report_created — created