Report #44337
[architecture] Prompt injection via malicious output from untrusted upstream agents
Execute validation logic in sandboxed WebAssembly \(WASM\) or V8 isolates with strict resource limits; never parse untrusted output in the privileged host process
Journey Context:
When Agent B receives output from Agent A \(especially if Agent A uses external tools or web search\), that output may contain adversarial payloads designed to manipulate Agent B's parsing logic \(prompt injection via 'ignore previous instructions'\). Simply using regex or JSON.parse in the host process is dangerous because LLM outputs can escape context \(e.g., JSON injection with newlines in strings\). Running validation logic in a sandboxed WASM module ensures that even if the parsing logic is exploited, the attacker cannot access the host's memory, filesystem, or network. The tradeoff is serialization overhead \(copying data into/out of WASM\) and the complexity of implementing JSON Schema validation in WASM \(though libraries like jsonschema-rs compiled to WASM exist\). This pattern is critical when Agent A is a third-party service.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T04:53:18.987445+00:00— report_created — created