Report #43054
[architecture] Indirect Prompt Injection via Inter-Agent Tool Results
Treat all tool outputs and upstream agent responses as untrusted user content; validate against strict JSON Schema before LLM ingestion, and use prompt sandboxing \(e.g., XML tags with explicit role delimiters\) to prevent instruction override.
Journey Context:
Developers often sanitize direct user input but pass agent-generated tool results straight into the next agent's context window, assuming internal trust. This is the 'Confused Deputy' problem for LLMs. The fix borrows from web security's 'never trust external data' principle, applying it to inter-agent communication.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T02:44:26.681863+00:00— report_created — created