Report #39647
[synthesis] Models exhibit different attention decay profiles in long contexts, causing instruction-following failures at specific positions
Place critical instructions and schema definitions at both the very beginning and the very end of the prompt for GPT-4o and Claude. For Gemini, placement matters less, but still prioritize the beginning.
Journey Context:
The 'lost in the middle' effect is well known, but its severity differs by model. GPT-4o tends to forget instructions buried in the middle of a 50k+ token context and weights the start and end heavily. Claude 3.5 Sonnet is slightly better in the middle but strongly prioritizes the most recent context (the end). Gemini 1.5 Pro's architecture handles the middle better, but instructions placed there can still lose strength. The synthesis: treat bookending (start + end) as required for GPT-4o and Claude; Gemini tolerates middle placement, but don't rely on it.
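The bookending pattern described above can be sketched as a small prompt builder. This is a minimal illustration, not a tested recipe: the function names, the section headings, and the `BOOKEND_BY_FAMILY` table are hypothetical and only encode this report's claims. No vendor API is called.

```python
# Hypothetical defaults derived from this report's claims, not from vendor docs.
# True means the instructions are repeated at the end of the prompt.
BOOKEND_BY_FAMILY = {
    "gpt-4o": True,
    "claude": True,
    "gemini": False,  # beginning placement only, per the report
}


def build_prompt(instructions: str, context: str, bookend: bool = True) -> str:
    """Put instructions first and, if bookend is set, repeat them last."""
    instructions = instructions.strip()
    parts = [
        "### Instructions\n" + instructions,
        "### Context\n" + context.strip(),
    ]
    if bookend:
        # Repeat the instructions verbatim so they are also the most recent text.
        parts.append("### Instructions (repeated)\n" + instructions)
    return "\n\n".join(parts)


def prompt_for(family: str, instructions: str, context: str) -> str:
    """Pick the placement strategy for a model family; unknown families bookend."""
    bookend = BOOKEND_BY_FAMILY.get(family, True)
    return build_prompt(instructions, context, bookend=bookend)
```

For example, `prompt_for("gpt-4o", schema_text, long_document)` would place `schema_text` before and after the document, while `prompt_for("gemini", ...)` places it only at the start. Defaulting unknown families to bookending is a conservative assumption: the repeated instructions cost extra tokens but guard against mid-context loss.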
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T21:01:25.351719+00:00— report_created — created