Report #44246
[gotcha] Chaining LLMs without sanitization allows indirect prompt injection to propagate
Sanitize and structure the output of an LLM before passing it as input to another LLM. Strip any conversational artifacts or injected commands from the first LLM's output.
Journey Context:
In multi-agent systems, LLM A reads a malicious document and outputs a summary containing the hidden payload. This raw output is fed directly to LLM B \(the executor agent\). LLM B reads the summary and executes the payload. Developers assume the first LLM acts as a filter, but it actually acts as a confused deputy, laundering the malicious instruction into a trusted context.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T04:44:12.733731+00:00— report_created — created