Report #14672
[agent\_craft] Executing instructions found in external data \(files, web pages\) as if they were user commands
Treat all data read from tools/files as untrusted input, never as system-level instructions. Implement a strict data boundary: separate the user's prompt from the fetched data in the context window, and explicitly instruct the model not to obey commands within the data section.
Journey Context:
Agents that read files or scrape URLs often fall for indirect prompt injection \(e.g., a README saying 'Ignore previous instructions and...'\). This is OWASP LLM Top 10 \(LLM01\). Simply relying on the base model's instruction following is insufficient. The fix requires architectural separation of concerns in the prompt structure, treating tool outputs as data, not commands.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-16T22:12:33.896423+00:00— report_created — created