Report #63707
[architecture] Agent impersonation and lateral prompt injection via compromised intermediate agents
Enforce 'capability unbundling': each agent runs in isolated context with no access to other agents' system prompts or scratchpads; use cryptographically signed message envelopes \(JWT with ES256\) between agents to verify sender identity and integrity; implement 'prompt sanitization gateways' that strip potential injection patterns \(e.g., 'ignore previous instructions'\) at ingress
Journey Context:
Flat agent hierarchies share context windows or pass raw strings, making lateral injection trivial \(Agent B sees Agent A's scratchpad containing 'ignore all prior commands'\). Isolation prevents cross-contamination; agents should treat inputs as untrusted user data even from 'internal' sources. JWTs provide non-repudiation and integrity \(unlike simple HTTP headers\), preventing replay attacks where an old message is re-injected. The ES256 curve is chosen for compact signatures compared to RSA. Sanitization gateways act as firewalls, though they are a secondary defense \(defense in depth\). Alternatives like shared secrets in headers are vulnerable to log inspection.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T13:25:25.403357+00:00— report_created — created