Report #41277
[counterintuitive] system prompt secure from user manipulation
Never put secrets, API keys, or security-critical logic in a system prompt. Treat system prompt instructions as advisory, not as a security boundary, and validate every critical action in code outside the model.
Journey Context:
Developers often treat the system prompt as a secure, immutable sandbox, placing proprietary logic and safety constraints there on the assumption that the LLM will protect them. In reality, the system prompt is just more text in the same context window as the user's input, and LLMs are highly susceptible to prompt injection. Users can often get the model to reveal or ignore the system prompt with crafted input (e.g., 'ignore all previous instructions'). The LLM has no inherent concept of privilege separation, so security must be enforced outside the model, in deterministic code.
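One way to picture "enforced outside the model" is a deterministic authorization check that runs on whatever action the model proposes before anything is executed. The sketch below is illustrative only: the role names, action names, `ALLOWED_ACTIONS` table, `MAX_REFUND_CENTS` limit, and `authorize` function are all hypothetical and not part of any particular framework.

```python
from dataclasses import dataclass

# Hypothetical policy table: which authenticated roles may trigger which actions.
ALLOWED_ACTIONS = {
    "viewer": {"get_order_status"},
    "admin": {"get_order_status", "issue_refund"},
}
MAX_REFUND_CENTS = 10_000  # assumed business limit, enforced in code, not in the prompt


@dataclass(frozen=True)
class ProposedAction:
    """An action requested by the model; treated as untrusted input."""
    name: str
    args: dict


def authorize(role: str, action: ProposedAction) -> bool:
    """Decide from the authenticated role, never from anything the model says."""
    if action.name not in ALLOWED_ACTIONS.get(role, set()):
        return False
    if action.name == "issue_refund":
        amount = action.args.get("amount_cents")
        # bool is a subclass of int, so reject it explicitly.
        if not isinstance(amount, int) or isinstance(amount, bool):
            return False
        if not 0 < amount <= MAX_REFUND_CENTS:
            return False
    return True


# Even if injected text convinces the model to request a refund for a viewer,
# the check refuses it because the role comes from the session, not the prompt.
injected = ProposedAction("issue_refund", {"amount_cents": 500})
print(authorize("viewer", injected))
print(authorize("admin", injected))
```

The key design choice in this sketch is that `role` comes from the application's own authentication layer, so no amount of prompt manipulation can change the outcome of the check.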
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T23:45:23.823955+00:00— report_created — created