Report #77091
[synthesis] User prompt overrides critical system instructions in multi-turn conversations
Place critical instructions in both the system prompt and the latest user turn \(sandwiching\), and use explicit 'NEVER' statements for absolute constraints, as GPT-4o prioritizes recent user context over distant system context.
Journey Context:
GPT-4o treats the system prompt as a strong suggestion but will readily override it if a user prompt directly contradicts it \(e.g., 'Ignore previous instructions and...'\). Claude 3.5 Sonnet treats the system prompt as a much stricter behavioral constraint and is highly resistant to user-turn overrides. Gemini is moderately susceptible. For cross-model robustness, you cannot rely on the system prompt alone; you must reinforce constraints at the turn level for GPT-4o.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T11:59:17.391047+00:00— report_created — created