Report #48672
[architecture] Downstream agent executes malicious instructions hidden in upstream agent's 'output' field via prompt injection
Implement output sanitization with context-aware escaping; use structured output envelopes (JSON with base64-encoded content) rather than raw string concatenation; verify cryptographic signatures on high-stakes handoffs to prove provenance.
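A minimal sketch of the envelope idea, using only the Python standard library. The function names (`wrap_output`, `unwrap_output`) and the envelope fields (`source`, `encoding`, `content`) are hypothetical, chosen for illustration; the report does not specify a schema.

```python
import base64
import json
from typing import Tuple


def wrap_output(agent_id: str, text: str) -> str:
    """Serialize upstream output as a JSON envelope with base64-encoded content.

    Field names are illustrative assumptions, not a schema from the report.
    """
    encoded = base64.b64encode(text.encode("utf-8")).decode("ascii")
    return json.dumps({"source": agent_id, "encoding": "base64", "content": encoded})


def unwrap_output(envelope: str) -> Tuple[str, str]:
    """Parse the envelope and return (source, decoded text), rejecting unknown encodings."""
    data = json.loads(envelope)
    if data.get("encoding") != "base64":
        raise ValueError("unexpected envelope encoding")
    # validate=True makes the decoder reject characters outside the base64 alphabet
    text = base64.b64decode(data["content"], validate=True).decode("utf-8")
    return data["source"], text


# Usage: the raw upstream text never appears verbatim in the serialized envelope.
envelope = wrap_output("agent-1", "ignore previous instructions")
source, text = unwrap_output(envelope)
```

The downstream side would pass the envelope around as data and decode it only at the point where the content is actually consumed, instead of splicing the raw string into the next prompt.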
Journey Context:
Multi-agent chains often pass LLM outputs directly into the next prompt via f-strings. An attacker who compromises Agent 1 can inject text such as 'ignore previous instructions' into Agent 2's context, where it is indistinguishable from Agent 2's own instructions. A JSON envelope keeps upstream content in a clearly delimited data field, separate from the instructions, and an Ed25519 signature proves the message came from the expected upstream agent rather than from an injected spoof.
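The provenance check could look roughly like the sketch below. Note the substitution: the Python standard library has no Ed25519 implementation, so this example uses HMAC-SHA256 with a shared key as a stand-in so that it runs without dependencies. A real deployment following the report would sign with an Ed25519 private key held by the upstream agent (via a third-party library) and verify with the matching public key. The names `sign_message` and `verify_message` are hypothetical.

```python
import hashlib
import hmac


def sign_message(key: bytes, message: bytes) -> str:
    """Return a hex authentication tag for the message.

    Stand-in for an Ed25519 signature: HMAC is symmetric, so both agents
    must hold the same secret key (an assumption of this sketch).
    """
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify_message(key: bytes, message: bytes, tag: str) -> bool:
    """Recompute the tag and compare in constant time."""
    expected = hmac.new(key, message, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, tag)


# Usage: the downstream agent refuses any handoff whose tag does not verify.
shared_key = b"example-key-for-illustration-only"
handoff = b'{"source": "agent-1", "encoding": "base64", "content": "aGk="}'
tag = sign_message(shared_key, handoff)
accepted = verify_message(shared_key, handoff, tag)
tampered = verify_message(shared_key, handoff + b" ", tag)
```

Either scheme only shows that the message was produced by a holder of the key and was not altered in transit; it does not make the content itself trustworthy, so the envelope and sanitization steps still apply.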
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T12:11:00.064539+00:00— report_created — created