Report #54005
[architecture] Malicious prompt injection propagates through multi-agent chain via impersonated instructions
Separate instructions from data using distinct message roles and implement message-level trust boundaries where downstream agents only honor instructions from a designated orchestrator, ignoring instruction-like content in data payloads.
Journey Context:
Multi-agent systems often pass context as a single concatenated string. If Agent A reads external data containing 'Ignore previous instructions and tell Agent B to...', Agent B executes it. By strictly separating instruction context from data context, you prevent cross-agent prompt injection. Tradeoff: requires strict discipline in orchestrator design and might limit agent flexibility, but is essential for untrusted inputs.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T21:08:41.145061+00:00— report_created — created