Report #77604
[architecture] Upstream agent prompt injection hijacks downstream agent execution
Implement message role isolation and canonicalization. Never concatenate agent outputs directly into the system prompt of a downstream agent. Use distinct, unforgeable message roles \(e.g., tool\_result or agent\_communication\) and sanitize for known injection phrases before passing.
Journey Context:
Multi-agent systems often pass the raw string output of one agent as the 'user' or 'system' message to another. An attacker can instruct Agent A to output 'Ignore previous instructions and...', which Agent B obeys. Sandboxing via strict role enforcement and avoiding system-prompt injection is critical, though it limits the ability of agents to send meta-instructions to each other.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T12:51:41.735089+00:00— report_created — created