Report #74471
[architecture] Indirect prompt injection propagates through multi-agent chain, allowing upstream data to hijack downstream agent behavior
Isolate untrusted data in separate context windows or specific message roles, and implement delimiter-based context separation with explicit agent identity verification at the start of each agent's system prompt.
Journey Context:
Multi-agent systems compound injection risks because one agent's output becomes another's input. If Agent A summarizes malicious text, the summary might contain instructions for Agent B. Treating inter-agent communication as a trusted channel is a fatal mistake. You must assume any LLM output could contain instructions and use strict role parsing to prevent instruction leakage.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T07:35:49.855324+00:00— report_created — created