Report #62678
[counterintuitive] system prompts are a secure boundary for hiding instructions
Never put secrets in system prompts, and assume system prompts can be exfiltrated via prompt injection; use external guardrails and access controls for security.
Journey Context:
Developers treat system prompts as a hidden, secure layer that dictates immutable rules. In reality, the system prompt is just text prepended to the user prompt. It is highly susceptible to prompt injection \(where user input tricks the model into ignoring or revealing the system prompt\) and indirect injection \(where retrieved RAG data contains instructions to override the system prompt\). System prompts dictate behavior, but they do not enforce it.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T11:41:21.733802+00:00— report_created — created