Report #95098
[gotcha] Prompt injection via streaming tool output
Buffer and sanitize tool outputs before injecting them into the LLM context, or apply strict content-type parsing that drops unexpected text/instructions from binary or structured data streams.
Journey Context:
Streaming keeps latency low, but it means the LLM processes tool output chunk by chunk as it arrives. If a tool streams a large file, an attacker can embed a payload at the end of the stream. The LLM may process the initial safe data, start acting on it, and then follow the instructions in the malicious tail. Buffering the full output gives up the latency benefit, but validating the reassembled stream against a strict schema before final LLM processing confirms the output is complete and sharply reduces the chance that injected text reaches the model.
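The buffer-then-validate approach described above could be sketched roughly as follows. This is a minimal illustration, not a vetted defense: it assumes the tool emits a single JSON object, and the names `MAX_BYTES`, `ALLOWED_FIELDS`, `ToolOutputRejected`, and `safe_tool_result` are hypothetical, introduced only for this example.

```python
import json
from typing import Iterable

MAX_BYTES = 1_000_000  # hypothetical cap on buffered tool output

# Hypothetical schema: the only keys allowed, with their exact types.
ALLOWED_FIELDS = {"path": str, "size": int, "sha256": str}


class ToolOutputRejected(ValueError):
    """Raised when the reassembled tool output fails validation."""


def buffer_stream(chunks: Iterable[bytes], max_bytes: int = MAX_BYTES) -> bytes:
    """Collect every chunk before anything is shown to the model."""
    buf = bytearray()
    for chunk in chunks:
        buf.extend(chunk)
        if len(buf) > max_bytes:
            raise ToolOutputRejected("tool output exceeds size cap")
    return bytes(buf)


def validate_tool_output(raw: bytes) -> dict:
    """Parse the whole payload and reject anything outside the schema."""
    try:
        # json.loads fails on trailing non-JSON text, so a payload
        # appended after the closing brace is rejected here.
        data = json.loads(raw.decode("utf-8"))
    except ValueError as exc:  # covers JSONDecodeError and UnicodeDecodeError
        raise ToolOutputRejected(f"not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ToolOutputRejected("top-level value must be an object")
    unexpected = set(data) - set(ALLOWED_FIELDS)
    if unexpected:
        raise ToolOutputRejected(f"unexpected fields: {sorted(unexpected)}")
    for key, expected_type in ALLOWED_FIELDS.items():
        if key not in data or type(data[key]) is not expected_type:
            raise ToolOutputRejected(f"missing or mistyped field: {key}")
    return data


def safe_tool_result(chunks: Iterable[bytes]) -> dict:
    """Buffer the stream, then validate it, before it enters the LLM context."""
    return validate_tool_output(buffer_stream(chunks))
```

Only the dictionary returned by `safe_tool_result` would be passed on to the model; a stream with extra text after the JSON object raises `ToolOutputRejected` instead. Free-text string fields can still carry instructions, so a schema check like this narrows the attack surface but does not replace further sanitization of those fields.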
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T18:12:08.202500+00:00— report_created — created