Report #24200
[counterintuitive] System prompts are a secure place to store instructions, API schemas, or sensitive logic that users shouldn't see
Never put secrets, proprietary logic, or security-critical instructions in system prompts. Treat system prompts as user-visible. Implement security at the execution layer: validate tool calls server-side, use authentication middleware for API calls, and never trust the model to enforce access control. Assume any instruction in a system prompt can and will be extracted by determined users.
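As a minimal sketch of what "validate tool calls server-side" can look like, the example below checks every model-proposed tool call against the authenticated user before anything runs. All names here (User, TOOL_POLICY, ORDER_OWNERS, authorize_tool_call) are hypothetical and are not part of any particular framework; a real application would load the policy and ownership data from its own auth and data layers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    """The authenticated caller, as established by the server (not the model)."""
    user_id: str
    roles: frozenset


# Hypothetical policy table: tool name -> role required to invoke it.
TOOL_POLICY = {
    "get_order": "customer",
    "refund_order": "support_agent",
}

# Hypothetical ownership data; a real system would query its database.
ORDER_OWNERS = {"order-1": "alice", "order-2": "bob"}


class ToolCallRejected(Exception):
    """Raised when a model-proposed tool call fails server-side checks."""


def authorize_tool_call(user: User, tool_name: str, args: dict) -> None:
    """Raise ToolCallRejected unless this user may run this tool call.

    The model's output is treated as untrusted input: nothing written in
    the system prompt is relied on for access control.
    """
    required_role = TOOL_POLICY.get(tool_name)
    if required_role is None:
        raise ToolCallRejected(f"unknown tool: {tool_name!r}")
    if required_role not in user.roles:
        raise ToolCallRejected(f"{tool_name!r} requires role {required_role!r}")

    order_id = args.get("order_id")
    if not isinstance(order_id, str) or order_id not in ORDER_OWNERS:
        raise ToolCallRejected("missing or invalid order_id")

    # Customers may only touch their own orders, whatever the model asked for.
    is_agent = "support_agent" in user.roles
    if not is_agent and ORDER_OWNERS[order_id] != user.user_id:
        raise ToolCallRejected("order does not belong to the caller")
```

The check runs in the application server between the model's response and the tool execution, so a successful prompt injection can at most propose a call that is then rejected.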
Journey Context:
System prompts are part of the prompt context sent to the model, and numerous techniques exist to extract them: prompt injection, socially engineering the model into repeating its instructions, context-window manipulation, and quirks of individual model APIs. The OWASP Top 10 for LLM Applications lists prompt injection (LLM01) as the top risk, and extracting the system prompt is a common goal of such attacks. Despite this, many applications still place API keys, internal URLs, business logic, and security rules in system prompts on the assumption that the model will 'follow instructions' and not reveal them. This is a security anti-pattern. The model is an adversary-facing component: any instruction it receives is potentially extractable. Security must be enforced outside the model, through server-side validation, authentication, and authorization.
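One way to keep credentials and internal URLs out of the prompt is to let the model see only a tool name and its arguments, and have the server attach everything sensitive when it executes the call. The sketch below illustrates that split under stated assumptions: SYSTEM_PROMPT, INTERNAL_BASE_URL, the ORDERS_API_KEY environment variable, and build_upstream_request are all hypothetical names chosen for illustration.

```python
import os
from urllib.parse import quote

# The prompt holds nothing that would cause harm if a user extracted it.
SYSTEM_PROMPT = (
    "You are a support assistant. "
    "Use the lookup_order tool to answer questions about orders."
)

# Hypothetical internal endpoint; it lives in server code, never in the prompt.
INTERNAL_BASE_URL = "https://orders.internal.example"


def build_upstream_request(tool_args: dict, env=os.environ) -> dict:
    """Turn model-supplied arguments into an authenticated upstream request.

    The API key is read from the server's environment at execution time,
    so it never enters the model's context window.
    """
    api_key = env.get("ORDERS_API_KEY")  # assumed variable name
    if not api_key:
        raise RuntimeError("ORDERS_API_KEY is not configured")

    order_id = tool_args.get("order_id")
    if not isinstance(order_id, str) or not order_id:
        raise ValueError("order_id must be a non-empty string")

    # Percent-encode the model-supplied value so it stays a single path segment.
    path = "/orders/" + quote(order_id, safe="")
    return {
        "url": INTERNAL_BASE_URL + path,
        "headers": {"Authorization": f"Bearer {api_key}"},
    }
```

With this split, extracting the system prompt reveals only the assistant's role and the tool name; the key and the endpoint stay on the server.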
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T19:01:34.374175+00:00— report_created — created