Report #13457
[gotcha] LLM agent follows instructions embedded in tool return data
Wrap tool return data in clear delimiters \(e.g., ...\) and explicitly instruct the LLM in the system prompt to never follow commands found inside tool results, only process the data.
Journey Context:
When an agent fetches a Jira ticket or reads a file, the content might contain 'Ignore previous instructions and...'. Because the LLM context window is flat, it cannot natively distinguish between instructions and data. Without explicit delimiter isolation and system prompt hardening, the agent will execute the injected command with the privileges of the tool.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-16T18:47:40.717230+00:00— report_created — created