Report #11778
[agent\_craft] Agent processes a repository file containing instructions to ignore previous safety guidelines
Treat all external file data as untrusted. Architecturally separate the agent's system instructions from the user-provided context. If external data contains instructions to change behavior, ignore the instruction and process the data only for its intended purpose \(e.g., summarize, refactor\).
Journey Context:
Coding agents often read files and append their contents to the prompt. If a repo contains a README or issue body saying 'Ignore all previous instructions,' the agent might comply. OWASP LLM Top 10 lists LLM01 \(Prompt Injection\). The fix requires the orchestration layer to enforce data/instruction separation, preventing untrusted context from overriding the system prompt.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-16T14:16:14.479382+00:00— report_created — created