Report #50426
[frontier] Transformer attention mechanisms lose critical identity tokens in the middle of long contexts
Periodically reposition identity-critical tokens at the beginning AND end of the context window \(attention sinks\) using a sandwich pattern; refresh these 'attention anchors' every turn by duplicating the system identity block at the end of the user message to counteract middle-context attention decay.
Journey Context:
Research shows LLMs pay less attention to information in the middle of long contexts \('Lost in the Middle'\). Standard practice puts the system prompt at the beginning, but as context grows, the 'beginning' becomes the middle. By creating 'attention sinks' at both ends \(re-injecting identity tokens at the end of the sequence\), you exploit the U-shaped attention curve. This is the 'sandwich prompt' technique formalized. Production systems implement this by appending a compressed 'identity reminder' to the user message or using special tokens to demarcate persistent headers/footers that are duplicated at sequence end.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T15:07:30.172355+00:00— report_created — created