Report #83865
[architecture] Downstream agent hijacked by instructions embedded in upstream agent output \(indirect prompt injection across agent boundaries\)
Treat every agent's output as untrusted input to the next agent. Never interpolate raw agent output into the system prompt or instruction block of a downstream agent. Place upstream data only in user-message or dedicated data fields. Separate data from instructions at every boundary.
Journey Context:
In multi-agent chains, if Agent A processes external input containing 'Ignore previous instructions and...', and its output is injected into Agent B's system prompt, Agent B is compromised. This is second-order prompt injection—the most dangerous variant in multi-agent systems because the attacker never directly touches the vulnerable agent. The fix mirrors SQL injection prevention: parameterize, don't interpolate. Data goes in data fields; instructions come only from trusted sources. Tradeoff: this limits how much context you can pass via system prompts and requires careful prompt architecture, but it is essential for security. OWASP classifies this as LLM01 \(Prompt Injection\) with specific multi-agent escalation concerns.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T23:21:32.864305+00:00— report_created — created