Report #81438
[counterintuitive] Can I securely hide instructions in the system prompt?
Never put secrets or critical security logic solely in the system prompt. Implement external guardrails (input/output classifiers) to enforce safety and prevent prompt injection.
Journey Context:
Developers often treat system prompts as secure, immutable code. However, LLMs are susceptible to prompt injection and jailbreaks that can trick the model into ignoring or revealing its system instructions. A system prompt is a soft constraint, not a hard security boundary, so anything placed in it should be assumed recoverable by a determined user.
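A minimal sketch of what external guardrails could look like, assuming a wrapper that sits between the user and the model. Everything here is hypothetical and for illustration only: the names `guarded_call`, `input_is_suspicious`, `output_leaks_prompt`, the regex patterns, and the `call_model` callback are not from any particular library. The regex check is a stand-in for a real input classifier, and the substring check is a stand-in for a real output classifier.

```python
import re
from typing import Callable, List, Pattern

# Hypothetical patterns standing in for a trained input classifier.
# A keyword list like this is easy to evade and is shown only for illustration.
INJECTION_PATTERNS: List[Pattern[str]] = [
    re.compile(r"ignore (all |any )?(previous|prior|above) instructions", re.IGNORECASE),
    re.compile(r"(reveal|print|repeat|show).{0,40}system prompt", re.IGNORECASE),
]

REFUSAL = "Sorry, I can't help with that request."


def _normalize(text: str) -> str:
    """Collapse whitespace and lowercase so trivial reformatting does not hide a leak."""
    return " ".join(text.split()).lower()


def input_is_suspicious(user_text: str) -> bool:
    """Input guardrail: flag text that matches a known injection phrasing."""
    return any(p.search(user_text) for p in INJECTION_PATTERNS)


def output_leaks_prompt(model_text: str, system_prompt: str, window: int = 40) -> bool:
    """Output guardrail: flag replies that contain a verbatim chunk of the system prompt."""
    out = _normalize(model_text)
    sys_text = _normalize(system_prompt)
    if not sys_text:
        return False
    if len(sys_text) <= window:
        return sys_text in out
    # Slide a fixed-size window over the prompt and look for any chunk in the reply.
    return any(
        sys_text[i:i + window] in out
        for i in range(len(sys_text) - window + 1)
    )


def guarded_call(
    user_text: str,
    system_prompt: str,
    call_model: Callable[[str, str], str],
) -> str:
    """Run the model only if the input passes, and return the reply only if the output passes."""
    if input_is_suspicious(user_text):
        return REFUSAL
    reply = call_model(system_prompt, user_text)
    if output_leaks_prompt(reply, system_prompt):
        return REFUSAL
    return reply
```

The point of the design is that both checks run outside the model, so a jailbreak that persuades the model to ignore its instructions still has to get past code the attacker cannot talk to. Neither check catches paraphrased or encoded leaks, which is one more reason secrets should not be in the prompt at all.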
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T19:17:13.398038+00:00— report_created — created