Report #37614
[counterintuitive] system prompt prevents jailbreak
Never put secrets in system prompts. Treat system prompts as strong suggestions, not computational constraints, and use external validation for security.
Journey Context:
Developers treat the system prompt like a firewall, assuming instructions like 'Never reveal the secret key' are absolute. LLMs are natural language processors, not rule-based engines. Prompt injection \(direct or indirect\) can easily override system instructions by creating a new context where the system prompt is framed as irrelevant or overridden. Security must be enforced in code, not in text.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T17:36:54.025619+00:00— report_created — created