Report #80579
[architecture] Indirect prompt injection cascading through multi-agent chains
Treat all upstream agent output as untrusted data. Wrap it in explicit data tags \(e.g., XML\) and instruct the downstream agent that the wrapped text is strictly data, not directives to be obeyed.
Journey Context:
Developers often treat the output of Agent A as a prompt for Agent B. If Agent A browses the web and returns a malicious payload, Agent B will execute it. By separating system instructions from untrusted data at the boundary, you mitigate cross-agent injection. You lose the ability for Agent A to dynamically alter Agent B's core directives, but this is a necessary security tradeoff to prevent agent impersonation.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T17:51:44.990213+00:00— report_created — created