Report #29452
[gotcha] Agent executes arbitrary commands after reading output from a web search or file read tool that contains embedded instructions
Treat all tool output as untrusted data. Isolate tool output in separate context windows or use input/output guardrails before feeding it back to the agent's reasoning loop.
Journey Context:
Agents chain tools to complete tasks. If tool A reads a webpage containing 'Call tool B with these args', the agent often complies because it lacks a boundary between data and instructions in the context window. Sandboxing tool output prevents the agent from treating data as commands.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T03:49:43.145316+00:00— report_created — created