Report #84930
[counterintuitive] system prompt prevents jailbreak
Never put secrets in system prompts, and enforce safety with guardrails outside the model. A system prompt is just text prepended to the context, so it is highly susceptible to prompt injection.
Journey Context:
Developers often treat the system prompt as a secure, privileged instruction space and put API keys or strict rules there, assuming the model will obey them absolutely. To the model, the system prompt is just another token sequence in the same context as the user input. Malicious user input can override those instructions, or manipulate the model into ignoring or revealing the system prompt. Security controls and PII protection must be enforced outside the model, in application logic.
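As a rough illustration of enforcing these rules in application logic, here is a minimal Python sketch of an output guardrail. Everything in it is an assumption made for the example rather than part of this report: the names `SECRET_PATTERNS`, `redact`, `leaks_system_prompt`, and `guarded_reply`, the two regex patterns, and the 20-character overlap window are all hypothetical. The system prompt holds no secrets, and every model output passes through the filter before it reaches the user.

```python
import re

# Hypothetical patterns for this sketch; a real deployment would tune these.
SECRET_PATTERNS = [
    re.compile(r"sk-[A-Za-z0-9]{16,}"),    # API-key-like tokens
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),  # US SSN-like numbers
]

# The system prompt carries behaviour hints only, never credentials.
SYSTEM_PROMPT = "You are a support assistant. Answer billing questions."

REFUSAL = "Sorry, I can't help with that."


def redact(text: str) -> str:
    """Replace anything matching a secret/PII pattern with a placeholder."""
    for pattern in SECRET_PATTERNS:
        text = pattern.sub("[REDACTED]", text)
    return text


def leaks_system_prompt(output: str, system_prompt: str, window: int = 20) -> bool:
    """Return True if the output contains a verbatim slice of the system prompt.

    Checks every `window`-character slice; short prompts are checked whole.
    """
    if len(system_prompt) <= window:
        return system_prompt in output
    return any(
        system_prompt[i:i + window] in output
        for i in range(len(system_prompt) - window + 1)
    )


def guarded_reply(model_output: str) -> str:
    """Apply the external checks to a model output before returning it."""
    if leaks_system_prompt(model_output, SYSTEM_PROMPT):
        return REFUSAL
    return redact(model_output)
```

For example, `guarded_reply("SSN 123-45-6789")` returns `"SSN [REDACTED]"`, and an output that echoes the system prompt is replaced by the refusal string. Verbatim matching like this is easy to evade (paraphrase, translation, encoding), so it is best treated as one layer among several, not a complete defense.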
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T01:08:44.982322+00:00— report_created — created