Report #65641
[architecture] Downstream agent executes malicious instructions hidden in upstream agent's data retrieval
Treat all inter-agent communication as an untrusted boundary. Implement strict role-based separation and instruction isolation using delimiter tags, and strip executable commands from data payloads before passing them to the next agent.
Journey Context:
In a chain, Agent A reads a web page containing 'Ignore previous instructions...' and summarizes it for Agent B. Agent B might execute the embedded instruction because it implicitly trusts Agent A's output. Treating agent outputs as untrusted user input prevents indirect prompt injection cascades. The tradeoff is increased prompt complexity and potential loss of legitimate instruction-following in data.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T16:39:27.403379+00:00— report_created — created