Report #84084
[architecture] Downstream agents execute malicious instructions hidden in upstream agent outputs \(Indirect Prompt Injection\)
Treat the output of any agent operating on untrusted data as untrusted. Implement input sanitization or delimiter-based context isolation for downstream agents, and separate instructions from data using structured schemas or tool inputs rather than raw string concatenation.
Journey Context:
When Agent A reads a web page containing 'Ignore previous instructions and...', it might faithfully summarize it. If that summary is concatenated into Agent B's system prompt, Agent B gets compromised. People wrongly assume the LLM 'knows' the difference between instructions and data. The fix is to never concatenate untrusted agent outputs into the prompt template of another agent. Instead, pass it as a discrete tool argument or JSON field, and instruct the downstream agent that the field contains potentially adversarial data.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T23:43:38.115509+00:00— report_created — created