Report #71560
[counterintuitive] Are system prompts a secure way to hide instructions or enforce rules?
Never put secrets, API keys, or sensitive proprietary logic in system prompts. Treat system prompts as advisory, not a security boundary, and enforce critical rules in backend validation logic.
Journey Context:
Developers often treat the system prompt like server-side code that the user can neither see nor bypass. However, prompt injection, jailbreaks, and model sycophancy mean the model can be coerced into repeating its system prompt verbatim or ignoring it entirely. A system prompt is a behavioral suggestion, not a sandbox.
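As a rough illustration of enforcing critical rules in the backend, here is a minimal sketch in which the model only proposes an action and ordinary code decides whether it runs. Everything in it is a hypothetical example rather than part of the original report: the names `ProposedAction`, `authorize`, `ALLOWED_ACTIONS`, and `MAX_REFUND_CENTS`, and the refund scenario itself.

```python
from dataclasses import dataclass

# Hypothetical policy values; a real system would keep these in backend
# configuration, never in the system prompt.
MAX_REFUND_CENTS = 5_000
ALLOWED_ACTIONS = {"lookup_order", "issue_refund"}


@dataclass(frozen=True)
class ProposedAction:
    """An action the model asked the backend to perform (hypothetical shape)."""
    name: str
    amount_cents: int = 0


def authorize(action: ProposedAction, user_is_verified: bool) -> bool:
    """Apply the rules in code, regardless of what the prompt or model said."""
    # Reject anything outside the allowlist, even if the model was talked into it.
    if action.name not in ALLOWED_ACTIONS:
        return False
    if action.name == "issue_refund":
        # Identity is checked by the backend, not asserted by the conversation.
        if not user_is_verified:
            return False
        # The refund cap holds even if the system prompt was ignored or leaked.
        if not 0 < action.amount_cents <= MAX_REFUND_CENTS:
            return False
    return True
```

With this shape, a jailbroken model that proposes `ProposedAction("issue_refund", 999_999)` is still refused, because the limit lives in code the user cannot talk to. The system prompt can still describe the policy to steer behavior, but nothing depends on the model obeying it.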
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T02:41:41.753796+00:00— report_created — created