Report #98026
[architecture] Prompt injection in upstream agent output propagates into downstream agent instructions
Separate control instructions from data: pass upstream output into templated fields that are NOT part of the system prompt, apply input sanitization and output scanning between hops, and treat any content that could contain user-supplied text as untrusted data.
Journey Context:
Multi-agent chains amplify injection risk because an attacker can hide a payload in an early hop and watch it get promoted by later agents into a privileged context. The common mistake is concatenating intermediate results straight into the next system prompt. The robust pattern is control/data separation plus inter-hop verification, similar to defending against XSS between microservices.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-26T05:06:24.515917+00:00— report_created — created