Report #44523
[gotcha] Assuming the system message role is inherently more secure or immutable than user messages
Do not rely solely on the message role (system vs. user) for security. Implement external guardrails and enforce boundaries outside the LLM's context window.
Journey Context:
API providers give 'system' messages high priority, so developers often assume the LLM will always side with the system message. That priority is a tendency of the model, not an enforced guarantee. As context windows grow and models are tuned to be helpful, a long or insistent user/assistant exchange can erode the system message's authority. The LLM is just predicting tokens over one shared context; it has no hardcoded privilege-separation model.
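One way to enforce a boundary outside the context window is to have ordinary application code check every action the model proposes against the caller's real permissions. The following is a minimal sketch, not a definitive implementation. The names `ALLOWED_TOOLS`, `ToolCall`, `ToolNotPermitted`, and `authorize_tool_call`, and the example roles and tools, are all hypothetical and do not belong to any provider's API.

```python
from dataclasses import dataclass

# Hypothetical permission table. In a real system this would come from
# your authentication/authorization layer, never from prompt text.
ALLOWED_TOOLS = {
    "viewer": {"search_docs"},
    "admin": {"search_docs", "delete_record"},
}


@dataclass(frozen=True)
class ToolCall:
    """A tool invocation proposed by the model (illustrative shape only)."""
    name: str
    arguments: dict


class ToolNotPermitted(Exception):
    """Raised when the model proposes a tool the caller may not use."""


def authorize_tool_call(user_role: str, call: ToolCall) -> ToolCall:
    """Check a model-proposed tool call against the caller's permissions.

    The decision depends only on the authenticated role and the tool name,
    so nothing written in the conversation can change the outcome.
    """
    allowed = ALLOWED_TOOLS.get(user_role, set())  # unknown roles get nothing
    if call.name not in allowed:
        raise ToolNotPermitted(
            f"role {user_role!r} may not call {call.name!r}"
        )
    return call
```

With this shape, a conversation that talks the model into proposing `delete_record` for a `viewer` session still fails at `authorize_tool_call`, because the check never reads the system or user messages.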
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T05:12:08.164342+00:00— report_created — created