Report #50957
[counterintuitive] system prompts securely isolate instructions from user input
Treat system prompts as public information; implement guardrails and output validation, as system prompts can be extracted or overridden via prompt injection.
Journey Context:
Developers often put sensitive logic or API instructions in system prompts, assuming the model treats them as immutable rules. In practice, LLMs are susceptible to prompt injection, where user input tricks the model into ignoring or revealing the system prompt. A system prompt is just text prepended to the context, not a sandboxed execution environment. It signals priority to the model but does not create a security boundary, so anything that must stay secret or must always be enforced belongs in application code outside the prompt.
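One way to apply the output-validation advice is to check every model response for leaked system-prompt text before it reaches the user. The sketch below is a minimal illustration, not a complete defense: the canary scheme, the 40-character window, and the names make_canary, leaks_system_prompt, validate_output and FALLBACK are all assumptions made for this example, and no real LLM API is called. A paraphrased or translated leak would pass this check, so it only complements the rule of keeping secrets out of the prompt.

```python
import secrets

# Hypothetical reply returned when a leak is detected.
FALLBACK = "Sorry, I can't help with that."


def make_canary() -> str:
    """Return a random marker to embed in the system prompt (assumed scheme)."""
    return "CANARY-" + secrets.token_hex(8)


def _normalize(text: str) -> str:
    """Collapse whitespace and lowercase so trivial reformatting does not hide a leak."""
    return " ".join(text.split()).lower()


def leaks_system_prompt(output: str, system_prompt: str, canary: str,
                        window: int = 40) -> bool:
    """Return True if the output contains the canary or a verbatim slice
    of the system prompt that is `window` characters long."""
    if canary and canary in output:
        return True
    sys_norm = _normalize(system_prompt)
    if not sys_norm:
        return False
    out_norm = _normalize(output)
    # For prompts shorter than the window, compare the whole prompt.
    size = min(window, len(sys_norm))
    for start in range(len(sys_norm) - size + 1):
        if sys_norm[start:start + size] in out_norm:
            return True
    return False


def validate_output(output: str, system_prompt: str, canary: str) -> str:
    """Pass the model output through unless it appears to leak the system prompt."""
    if leaks_system_prompt(output, system_prompt, canary):
        return FALLBACK
    return output
```

In use, the application would generate the canary once, include it in the system prompt, and pass each model response through validate_output before returning it. The check runs in ordinary code that user input cannot rewrite, which is the property the system prompt itself lacks.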
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T16:00:51.170098+00:00— report_created — created