Report #50031
[counterintuitive] Are system prompts a secure way to enforce LLM behavioral constraints?
Never rely on system prompts as a security boundary. Implement external guardrails (input/output classifiers, regex checks, API permission scoping) to enforce constraints.
Journey Context:
Developers often place secret instructions or strict rules in the system prompt, assuming the model treats it as an immutable override. In reality, a system prompt is just text placed at the start of the context window, and the model weighs it alongside everything that follows. That makes it susceptible to prompt injection, where user input tricks the model into ignoring or revealing the earlier system instructions. System prompts are for steering, not for security.
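A minimal sketch of what external guardrails could look like, assuming a Python service that wraps the model call. Every name here (INJECTION_PATTERNS, SECRET_PATTERN, ALLOWED_TOOLS, check_input, check_output, authorize_tool_call, guarded_completion, and the call_model callable) is hypothetical and chosen for illustration; none comes from a specific library. The regex lists are deliberately tiny and easy to bypass, so treat them as one layer alongside classifiers and permission scoping, not as a complete defense.

```python
import re
from dataclasses import dataclass
from typing import Callable

# Hypothetical patterns for illustration; real deployments need broader coverage.
INJECTION_PATTERNS = [
    re.compile(r"ignore (all |any )?(previous|prior|above) instructions", re.IGNORECASE),
    re.compile(r"reveal (your|the) system prompt", re.IGNORECASE),
]

# Assumed shape of a credential that must never appear in model output.
SECRET_PATTERN = re.compile(r"sk-[A-Za-z0-9]{16,}")

# Permission scoping: the only tools this deployment may invoke (assumed names).
ALLOWED_TOOLS = frozenset({"search_docs", "get_order_status"})


@dataclass
class GuardrailResult:
    allowed: bool
    reason: str = ""


def check_input(user_text: str) -> GuardrailResult:
    """Reject user input that matches a known injection phrase."""
    for pattern in INJECTION_PATTERNS:
        if pattern.search(user_text):
            return GuardrailResult(False, f"input matched {pattern.pattern!r}")
    return GuardrailResult(True)


def check_output(model_text: str) -> GuardrailResult:
    """Reject model output that contains a secret-like token."""
    if SECRET_PATTERN.search(model_text):
        return GuardrailResult(False, "output contains a secret-like token")
    return GuardrailResult(True)


def authorize_tool_call(tool_name: str) -> GuardrailResult:
    """Enforce the tool allowlist in code, regardless of what the model asks for."""
    if tool_name not in ALLOWED_TOOLS:
        return GuardrailResult(False, f"tool {tool_name!r} is not permitted")
    return GuardrailResult(True)


def guarded_completion(user_text: str, call_model: Callable[[str], str]) -> str:
    """Run input and output checks around a model call supplied by the caller."""
    if not check_input(user_text).allowed:
        return "Request blocked."
    reply = call_model(user_text)
    if not check_output(reply).allowed:
        return "Response withheld."
    return reply
```

The point of the design is that each check runs in application code outside the model, so a successful injection can change what the model says but not what the surrounding system permits.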
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T14:27:37.212960+00:00— report_created — created