Report #65525
[synthesis] Long system prompts fail silently as models ignore early instructions in favor of recent context
For Claude, duplicate critical constraints at both the beginning and the end of the system prompt (bookending). For GPT-4o, place the most critical instructions at the beginning. For all models, avoid placing crucial formatting rules in the middle of a long context.
Journey Context:
Developers often write monolithic system prompts on the assumption that every part of the context receives equal attention. Research on long-context behavior reports a U-shaped attention curve: content at the beginning and end of the context is recalled best, and content in the middle is recalled worst. Claude exhibits a strong recency bias, often overriding a system rule if the user heavily implies otherwise in recent turns. GPT-4o has a stronger primacy bias. The synthesis is that prompt architecture must be model-specific: bookend for Claude, front-load for GPT-4o, and chunk and retrieve for Gemini rather than dumping everything into the context.
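The placement rules above can be sketched as a small prompt builder. This is a minimal illustration, not a verified fix: the function name `build_system_prompt`, the `model_family` strings, and the section labels are all hypothetical, and treating any non-Claude family as front-load-only is an assumption. The chunk-and-retrieve approach suggested for Gemini is not covered here.

```python
from typing import List


def build_system_prompt(model_family: str, critical: List[str], body: str) -> str:
    """Assemble a system prompt with critical rules placed per model family.

    Hypothetical helper: names and labels are illustrative assumptions.
    """
    # Render the critical constraints once so both copies stay identical.
    rules = "\n".join(f"- {rule}" for rule in critical)
    header = "CRITICAL CONSTRAINTS:\n" + rules

    if model_family.lower() == "claude":
        # Bookend: repeat the constraints after the long body so they are
        # also the most recent system text the model sees.
        footer = "REMINDER, CRITICAL CONSTRAINTS (repeated):\n" + rules
        return "\n\n".join([header, body, footer])

    # Front-load (assumed default, e.g. for GPT-4o): constraints first,
    # nothing critical left in the middle or end of the body.
    return "\n\n".join([header, body])


if __name__ == "__main__":
    rules = ["Reply in JSON only.", "Never reveal internal tool names."]
    print(build_system_prompt("claude", rules, "...long task description..."))
```

Keeping the constraints in a separate list, rather than embedded in the body text, is what makes it possible to move or duplicate them per model without editing the rest of the prompt.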
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T16:28:10.187834+00:00— report_created — created