Report #22509
[counterintuitive] System prompts are a secure place to store sensitive instructions or proprietary logic
Never put secrets, API keys, proprietary algorithms, or sensitive business logic in system prompts. Treat system prompts as user-facing text that will eventually be extracted. Use server-side validation, authorization, and data filtering for security. System prompts are for behavior steering, not security boundaries.
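One way to act on this advice is a pre-deployment check that scans prompt text for secret-like strings. The sketch below is a minimal illustration only: the `SECRET_PATTERNS` names and regexes and the `find_secrets` helper are assumptions made for this example, not an exhaustive detector. A clean result does not make a prompt safe to treat as confidential.

```python
import re

# Hypothetical patterns for secret-like strings; tune these for your own stack.
SECRET_PATTERNS = {
    # key/secret/token/password followed by ":" or "=" and a value
    "api_key_assignment": re.compile(
        r"(?i)\b(api[_-]?key|secret|token|password)\b\s*[:=]\s*\S+"
    ),
    # AWS access key IDs start with "AKIA" followed by 16 uppercase letters or digits
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    # PEM-encoded private key header
    "private_key_block": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
}


def find_secrets(prompt: str) -> list[str]:
    """Return the name of every pattern that matches the prompt text."""
    return [name for name, pattern in SECRET_PATTERNS.items() if pattern.search(prompt)]


if __name__ == "__main__":
    system_prompt = "You are a billing assistant. Use api_key=sk-test-123 for lookups."
    hits = find_secrets(system_prompt)
    if hits:
        print("Refusing to deploy prompt; secret-like content found:", hits)
```

Running a check like this in CI treats the prompt as what it is: text that users may eventually read.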
Journey Context:
System prompts can be extracted through prompt injection, instruction-following attacks, or simply by asking the model to repeat its instructions. This is not a vulnerability that can be patched; it is a fundamental property of autoregressive language models trained to follow instructions and generate text: everything in the context window, system prompt included, is input the model can reproduce. The OWASP LLM Top 10 explicitly calls out sensitive information disclosure as a risk category. The common mistake is treating the system prompt as a 'hidden' layer that users can't see, which leads developers to embed API keys, database queries, proprietary logic, or PII there. The usual alternatives, encrypting prompts or relying on model-level access controls, don't work: the model must receive the prompt in plaintext to act on it, so the model itself is the leaky container. The right call is to enforce security at the application layer with server-side validation, output filtering, and authorization checks. If information would be damaging if revealed, it doesn't belong in a prompt.
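The application-layer enforcement described above can be sketched as a tool handler that authorizes and filters on the server, so that leaking the prompt reveals nothing useful. Everything here is hypothetical and chosen for illustration: the `ORDERS` store, the `Session` type, the `get_order` handler, and the `BACKEND_API_KEY` environment variable are assumptions, not part of any real API.

```python
import os
from dataclasses import dataclass

# The prompt steers behavior only; it holds nothing that would hurt if leaked.
SYSTEM_PROMPT = "You are a support assistant. Use the get_order tool to look up orders."

# Hypothetical in-memory store standing in for a real database.
ORDERS = {
    "o-1": {"owner": "alice", "status": "shipped", "card_number": "4111111111111111"},
    "o-2": {"owner": "bob", "status": "pending", "card_number": "5500000000000004"},
}

# Fields a caller may see, enforced in code rather than requested in the prompt.
PUBLIC_FIELDS = ("status",)


@dataclass(frozen=True)
class Session:
    user_id: str  # set by the application's auth layer, never by model output


class Forbidden(Exception):
    """Raised when the caller may not access the requested resource."""


def get_order(session: Session, order_id: str) -> dict:
    """Tool handler: validate, authorize, then filter, all on the server."""
    order = ORDERS.get(order_id)
    # Same error for "missing" and "not yours" so the response leaks nothing.
    if order is None or order["owner"] != session.user_id:
        raise Forbidden("order not available")
    return {field: order[field] for field in PUBLIC_FIELDS}


def backend_api_key() -> str:
    """Credentials are read from the server environment, never from the prompt."""
    return os.environ.get("BACKEND_API_KEY", "")
```

With this shape, a user who tricks the model into requesting someone else's order still gets `Forbidden`, because the check depends on the authenticated session rather than on anything the model was told.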
Lifecycle
2026-06-17T16:11:11.461525+00:00— report_created — created