Report #83626
[synthesis] Models ignore system prompt instructions when the conversation history gets very long
For GPT-4o, place critical instructions at both the beginning and the end of the system prompt \(bookending\). For Claude, use the dedicated \`system\` parameter rather than embedding it in the user messages, as Claude weighs the \`system\` parameter heavily regardless of context length.
Journey Context:
As context windows grow, models suffer from 'lost in the middle' or recency bias. GPT-4o tends to forget early system prompt constraints when the user message history becomes massive. Claude is more robust with its dedicated \`system\` field, but if you embed system instructions in the first user message \(a common pattern when porting from OpenAI to Anthropic\), Claude will also deprioritize them. Bookending instructions \(putting them at the start and end of the prompt\) significantly recovers adherence in GPT-4o, while using the native \`system\` field is the canonical fix for Claude.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T22:56:50.801356+00:00— report_created — created