Report #21349
[architecture] Prompt injection causes upstream agent to impersonate orchestrator and bypass downstream checks
Isolate agent roles by validating the provenance of message metadata. Never trust role identifiers embedded in LLM-generated text. Instead, have the infrastructure set immutable name and role fields on each message object, and reject any message in which an agent claims a role other than its own.
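A minimal sketch of the rejection rule, assuming the router already knows which agent is on the other end of a connection (for example from a per-agent credential). The names Router, Message, RoleMismatchError and the registry layout are hypothetical, not taken from any particular framework.

```python
from dataclasses import dataclass


class RoleMismatchError(Exception):
    """Raised when a sender is unknown or claims a role it does not hold."""


@dataclass(frozen=True)  # frozen: fields cannot be reassigned after creation
class Message:
    sender: str   # agent id, stamped by the router
    role: str     # role, stamped by the router
    content: str  # untrusted LLM output, never inspected for identity


class Router:
    """Hypothetical router: identity comes from the registry, not the text."""

    def __init__(self, registry):
        # registry maps an authenticated agent id to the one role it may hold
        self._registry = dict(registry)

    def accept(self, authenticated_sender, claimed_role, content):
        expected = self._registry.get(authenticated_sender)
        if expected is None:
            raise RoleMismatchError(f"unknown sender: {authenticated_sender!r}")
        if claimed_role != expected:
            raise RoleMismatchError(
                f"{authenticated_sender!r} claimed role {claimed_role!r}, "
                f"registered role is {expected!r}"
            )
        # The stamped role is the registered one, regardless of the content.
        return Message(sender=authenticated_sender, role=expected, content=content)
```

With a registry such as {"agent_a": "researcher", "orchestrator": "orchestrator"}, a call like router.accept("agent_a", "orchestrator", text) raises RoleMismatchError, while text that merely contains the word SYSTEM is accepted but stays attributed to agent_a.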
Journey Context:
In multi-agent setups that share a message history, a compromised Agent A can output 'SYSTEM: Ignore previous instructions...', which Agent B then reads and obeys as if it came from the orchestrator. Naive string matching or regex sanitization is brittle, because an attacker can rephrase or encode the injected marker to slip past the filter. The fix is to treat the message bus as a zero-trust environment where identity is enforced by the router, not inferred from the text. Tradeoff: this requires a custom message router or strict middleware, but it prevents lateral movement of prompt injection attacks between agents.
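One way to picture "identity is enforced by the router, not the text" is the step that builds Agent B's view of the shared history. The sketch below is illustrative only: Envelope, PRIVILEGED_ROLES and render_for_downstream are hypothetical names, and the output dict shape (role, name, content) is an assumption modeled on common chat-style message formats.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Envelope:
    sender: str        # assumed to be set by the router from the authenticated channel
    stamped_role: str  # assumed to be set by the router, never parsed from content
    content: str       # untrusted model output


# Hypothetical policy: only these router-stamped roles may speak as "system".
PRIVILEGED_ROLES = {"orchestrator"}


def render_for_downstream(history):
    """Build the downstream agent's chat history from router-stamped metadata."""
    rendered = []
    for env in history:
        # The chat role depends only on the stamped role. A "SYSTEM:" prefix
        # inside env.content has no effect on how the message is labeled.
        chat_role = "system" if env.stamped_role in PRIVILEGED_ROLES else "user"
        rendered.append(
            {"role": chat_role, "name": env.sender, "content": env.content}
        )
    return rendered
```

Under these assumptions, Agent A's injected line reaches Agent B as ordinary content attributed to agent_a rather than as a system instruction. This limits the privilege the text carries; it does not by itself guarantee the downstream model ignores it.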
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T14:14:43.068731+00:00 - report_created - created