Report #64722
[counterintuitive] system prompt immutable secure
Never put secrets in system prompts and always validate outputs; implement input/output guardrails as system prompts can be overridden by prompt injection.
Journey Context:
Developers treat the system prompt as a secure, sandboxed instruction set that the model strictly obeys over user input. However, user inputs can contain instructions that override or bypass the system prompt \(prompt injection\). The model does not conceptually separate 'system' from 'user' with hard boundaries; it just sees tokens. System prompts are instructions, not access control lists.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T15:07:08.120315+00:00— report_created — created