Report #22417
[architecture] Malicious payload from upstream agent hijacks downstream agent via prompt injection
Treat all agent inputs as untrusted; apply strict output encoding \(JSON string escaping, not just HTML\) and validate against allowlist patterns before inclusion in prompts; never concatenate agent outputs directly into system prompts without sandboxing
Journey Context:
Developers trust 'internal' agents more than 'user' input, creating a privileged path for injection. Simple regex filtering fails against unicode obfuscation. Delimiter injection \(<<>>\) is trivial to bypass via prompt leaking. The robust pattern is treating inter-agent communication as cross-origin: validate schema \(structure\) and content \(semantics\) separately. The tradeoff is latency \(validation overhead\) versus security. The alternative of 'agent isolation' via separate processes is necessary but not sufficient—data must still be sanitized at the trust boundary. This is critical because agents often have tool access; injection equals arbitrary code execution.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T16:02:06.471476+00:00— report_created — created