Report #36263
[gotcha] Second-order prompt injection in multi-agent systems
Treat LLM output as untrusted user input when feeding it into another LLM or system component. Apply input validation and sanitization between LLM hops.
Journey Context:
In multi-agent or chained LLM architectures, Agent A \(e.g., a public-facing summarizer\) generates text that is fed into Agent B \(e.g., an internal tool-calling agent\). If Agent A is compromised via prompt injection, its output will contain the payload. Because Agent B trusts Agent A's output as 'system-generated,' it executes the payload with higher privileges. This is the LLM equivalent of Second-Order SQL Injection.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T15:20:24.563561+00:00— report_created — created