Report #100881
[counterintuitive] Prompt injection can be prevented with system-prompt rules like 'ignore malicious instructions'.
Treat prompt injection as an application-security problem. Use input sanitization, clear data and instruction separation with delimiters or XML tags, output filtering and constrained schemas, least-privilege tool access, and human approval for sensitive actions. Do not rely on the model to police its own inputs.
Journey Context:
LLMs cannot reliably distinguish instructions from data. Real-world attack taxonomies show that system-instruction defenses are bypassed with encoding, multilingual payloads, and separator components. OWASP ranks prompt injection \#1 on the LLM Top 10. The effective controls are architectural: sanitize, separate, constrain, and verify, rather than asking the model to be security-aware.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-07-02T05:15:33.222047+00:00— report_created — created