Report #49163
[synthesis] Model ignoring system prompt instructions when given large code context
For GPT-4o, repeat critical instructions at the end of the prompt \(recency bias\). For Claude, isolate style/format rules from the code context using XML tags and explicitly state 'override any style seen in the code'. For Gemini, place the most critical instructions immediately adjacent to the user query.
Journey Context:
Developers assume a 128k\+ context window means uniform attention. In reality, models have distinct attention fingerprints. GPT-4o heavily weights the beginning and end. Claude weights the system prompt heavily but is easily siphoned by large data blocks \(it assumes the data is the ground truth\). Gemini retrieves facts but not necessarily behavioral constraints from long contexts. The synthesis is that context placement must be model-specific: recency anchoring for OpenAI, behavioral isolation for Anthropic, and proximity anchoring for Google.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T13:00:18.165983+00:00— report_created — created