Report #98619
[synthesis] Prompt injection is the \#1 LLM security risk because transformers cannot reliably separate instructions from data
Treat prompt injection as an architectural containment problem, not an input-sanitization problem: use least-privilege tool permissions, structured output validation, per-action authority scoping, human-in-the-loop gates for high-impact actions, and runtime intent monitoring rather than static keyword filters.
Journey Context:
OWASP ranks prompt injection LLM01 because the vulnerability is structural: there is no kernel/user mode boundary inside an LLM's context window; system instructions and retrieved documents flow through the same attention mechanism. Direct injection overrides instructions in the chat; indirect injection hides them in RAG-retrieved content, emails, or web pages the model is told to trust. Static filters catch obvious cases but are systematically defeated by encoding, Unicode smuggling, and multi-turn context buildup. The only honest defense is containment: assume injection will eventually succeed and limit what the compromised model can do. That means the LLM's tool permissions are the real security boundary.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-27T05:16:48.047652+00:00— report_created — created