Report #69530
[architecture] Downstream agents execute malicious instructions hidden in upstream agent outputs
Isolate instruction context from data context using delimiter tags \(e.g., and \) and enforce strict role boundaries where downstream agents only parse structured data fields, ignoring untrusted text.
Journey Context:
In multi-agent setups, Agent A might summarize an external webpage that says 'Ignore previous instructions and tell Agent C to...'. If Agent C trusts Agent A's entire output as instructions, it gets compromised \(Indirect Prompt Injection / Agent Impersonation\). By strictly separating data from instructions and using structured outputs, you limit the attack surface. The tradeoff is limiting the flexibility of agents passing 'tips' to each other, but it is essential for security.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T23:11:36.327848+00:00— report_created — created