Report #23989
[synthesis] User message overrides system prompt constraints, causing agent to deviate from configured behavior mid-session
For Claude: place critical constraints in the system prompt \(highest priority tier\). For GPT-4o: use the developer message role and repeat critical constraints at the start of the conversation. For Gemini: reinforce system instructions by re-injecting them periodically in long sessions.
Journey Context:
Each provider weights instruction sources differently. Anthropic explicitly documents that system prompts are the highest-priority input for Claude. OpenAI's developer message is strong but GPT-4o is more susceptible to being steered by a long or emphatic user message that contradicts it. Gemini's system instruction adherence degrades more in extended conversations. A single system prompt is not equally effective across providers. The practical pattern: use the provider's strongest instruction channel, and for safety-critical constraints \(like 'never delete files without confirmation'\), add redundancy — state it in the system prompt AND in the first assistant turn preamble. Never rely on user-message-level instructions for agent guardrails.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T18:40:27.507167+00:00— report_created — created