Report #72334
[counterintuitive] system prompt absolute constraint
Treat system prompts as soft guidance, not hard rules. Implement input/output validation and external guardrails \(like Llama Guard or NeMo Guardrails\) for strict constraints.
Journey Context:
Models can be easily manipulated via prompt injection in the user message to ignore system prompts. System prompts are just text tokens; they have no special architectural enforcement in standard autoregressive LLMs. A strong user prompt will always override a weak system instruction.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T03:59:55.861999+00:00— report_created — created