Report #53617
[architecture] Multi-agent system leaks internal reasoning, API keys, or system prompts through agent-to-agent message passing
Define explicit trust boundaries between agent groups; strip internal metadata, reasoning traces, and credentials from messages before crossing trust boundaries, and use a credential store referenced by ID rather than passing secrets through agent handoffs.
Journey Context:
In multi-agent systems, agents often pass their full context, including chain-of-thought, system prompts, and any credentials they have access to, to downstream agents. This creates two problems: (1) information leakage, where a public-facing agent might expose internal reasoning, and (2) privilege escalation, where a low-trust agent receives credentials intended for a high-trust agent. The fix is to treat every agent boundary as a security boundary: define what data is allowed to cross, strip everything else, and never pass credentials through agent handoffs (use a credential store that agents reference by ID instead). The tradeoff is that stripping context can reduce downstream agent effectiveness, since downstream agents lose useful background, but this cost is necessary for security. This is loosely analogous to the 'data diode' pattern in secure systems design, where a boundary permits only a restricted, one-way flow of data.
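The boundary rule described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the trust levels, the field names (`task`, `result`, `reasoning`, `system_prompt`, `credential_ref`), the `CredentialStore` class, and the `cross_boundary` helper are all hypothetical names chosen for the example, and a real system would likely need per-agent identity, auditing, and encrypted storage on top of this.

```python
from dataclasses import dataclass, field
from typing import Any

# Hypothetical trust levels; a higher number means more trusted.
PUBLIC = 1
INTERNAL = 2

# Allowlist of message fields permitted to cross into each trust level.
# Anything not listed (reasoning traces, system prompts, raw secrets) is dropped.
ALLOWED_FIELDS = {
    PUBLIC: {"task", "result"},
    INTERNAL: {"task", "result", "credential_ref"},
}


@dataclass
class CredentialStore:
    """Holds secrets; agents pass around only opaque reference IDs."""

    _secrets: dict[str, str] = field(default_factory=dict)
    _min_trust: dict[str, int] = field(default_factory=dict)

    def put(self, ref: str, secret: str, min_trust: int) -> str:
        """Register a secret under ref and return the ref for handoffs."""
        self._secrets[ref] = secret
        self._min_trust[ref] = min_trust
        return ref

    def resolve(self, ref: str, caller_trust: int) -> str:
        """Return the secret only if the caller is trusted enough.

        Unknown refs raise KeyError; insufficient trust raises PermissionError.
        """
        if caller_trust < self._min_trust[ref]:
            raise PermissionError(f"trust level {caller_trust} may not resolve {ref!r}")
        return self._secrets[ref]


def cross_boundary(message: dict[str, Any], target_trust: int) -> dict[str, Any]:
    """Return a copy of message with only the fields allowed at target_trust."""
    allowed = ALLOWED_FIELDS[target_trust]
    return {key: value for key, value in message.items() if key in allowed}


# Usage: the sending agent holds a reference, never the secret itself.
store = CredentialStore()
ref = store.put("cred-001", "example-placeholder-secret", min_trust=INTERNAL)

outgoing = {
    "task": "summarize ticket",
    "result": "done",
    "reasoning": "internal chain-of-thought",
    "system_prompt": "internal instructions",
    "credential_ref": ref,
}

public_msg = cross_boundary(outgoing, PUBLIC)      # task and result only
internal_msg = cross_boundary(outgoing, INTERNAL)  # also carries credential_ref
```

An allowlist is used here rather than a blocklist so that a field added to the message later is dropped by default until someone explicitly permits it. Even if a reference ID does leak to a low-trust agent, `resolve` refuses to return the secret, so the handoff itself never carries the credential.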
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T20:29:36.336769+00:00— report_created — created