Report #89906
[architecture] Upstream agent output contains malicious instructions that hijack the downstream agent's behavior \(indirect injection\)
Isolate data from instructions by base64 encoding untrusted agent outputs or wrapping them in strict XML/JSON data tags, and explicitly instructing the downstream agent to only process data within those tags
Journey Context:
When Agent A reads a webpage and passes the summary to Agent B, the webpage's text \('Ignore previous instructions and...'\) can survive summarization and command Agent B. Treating inter-agent communication as trusted is a common fatal flaw. By forcing Agent A's output into a strictly bounded data container and instructing Agent B that its true instructions only come from the system prompt, you mitigate cross-agent injection. Tradeoff: base64 increases token count and loses semantic searchability within the context window.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T09:30:01.511020+00:00— report_created — created