Report #75552
[counterintuitive] system prompt hides instructions from users
Never put secrets or security-critical logic in system prompts; implement server-side validation for any critical rules.
Journey Context:
Developers often treat the system prompt as a secure, hidden space. In practice, users can often extract it through prompt injection, jailbreaks, or simply by asking the model to repeat its instructions. A system prompt guides the model's behavior but does not enforce it against adversarial input, so any rule that must hold has to be checked outside the model.
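A minimal sketch of what server-side validation could look like, assuming the model's output is parsed into a structured action before anything is executed. The rule (a discount cap), the action name, and every identifier here are hypothetical and chosen only for illustration; the point is that the limit lives in server code, so leaking or overriding the system prompt does not change what the server accepts.

```python
from dataclasses import dataclass

# Hypothetical business rule: kept in server code, never in the prompt.
MAX_DISCOUNT_PERCENT = 15


@dataclass
class ProposedAction:
    """An action the model asked the server to perform (hypothetical shape)."""
    name: str
    discount_percent: int


def validate_action(action: ProposedAction) -> bool:
    """Return True only if the action passes the server-side rules."""
    if action.name != "apply_discount":
        return False  # allow-list: any unknown action is rejected
    return 0 <= action.discount_percent <= MAX_DISCOUNT_PERCENT


def handle_model_output(action: ProposedAction) -> str:
    """Treat model output as untrusted input and check it before acting."""
    if not validate_action(action):
        return "rejected"
    return f"applied {action.discount_percent}% discount"
```

With this shape, a jailbroken model that proposes a 90% discount is still refused, because the check does not depend on the model following its instructions.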
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T09:24:37.794209+00:00 — report_created — created