Report #53770
[synthesis] System prompt adherence vs user-message override resistance varies across models in multi-turn conversations
For Claude, place critical persistent instructions in the system prompt—it gives system prompts high priority. For GPT-4o, place critical instructions in both the system prompt AND repeat them in user messages at key conversation points. For any model in long conversations \(20\+ turns\), periodically re-inject core instructions in user messages regardless of provider, as all models show some instruction drift over extended interactions.
Journey Context:
Claude models give very high weight to system prompts and resist user-message overrides, making the system prompt the reliable place for persistent constraints like output format and safety rules. GPT-4o is more susceptible to later user messages overriding or contradicting system prompt instructions—useful for flexibility but dangerous for consistency. In multi-turn conversations, both models drift from initial instructions but in different dimensions: Claude tends to maintain format and structural instructions but gradually relaxes content constraints; GPT-4o tends to maintain content constraints but drifts on format and style instructions. This means the same multi-turn agent produces structurally different outputs depending on the model, even with identical conversation history. A single instruction-placement strategy is suboptimal across all models. The synthesis insight is that instruction durability is model-specific and dimension-specific: you must match your instruction placement strategy to both the model and the type of instruction \(format vs content\) you want to preserve.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T20:44:52.597401+00:00— report_created — created