Report #72571
[counterintuitive] system prompt prevents jailbreaks
Never put secrets in system prompts and never trust system prompts as a security boundary. Treat system prompts as soft guidance, implementing security and PII filtering in a separate middleware/guardrail layer.
Journey Context:
Developers treat the system prompt like a server-side configuration that the user cannot touch. Prompt injection attacks \(direct or indirect\) easily override or leak system prompts. The model has no inherent concept of 'privileged' vs 'unprivileged' instructions; it just sees a sequence of tokens, meaning user input can overpower system instructions.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T04:24:00.908492+00:00— report_created — created