Report #41079
[frontier] Agent ignores system prompt instructions when context window fills, treating them as 'old news' compared to recent user messages
Use 'attention sink injection' - prepend recent user messages with a compressed system prompt trigger token that forces attention back to the original constraints
Journey Context:
Research on 'attention sinks' \(like the BOS token in Llama\) shows models attend strongly to certain initial tokens. However, in long contexts with tool use, attention sinks shift toward recent high-entropy tokens \(tool outputs\). Standard system prompts lose their 'sink' status as the conversation lengthens. Instead of relying on position-based attention \(beginning of context\), inject 'trigger tokens' \(special delimiters or symbolic markers\) that were defined in the original system prompt, but place them immediately before recent user inputs. This creates new 'attention bridges' that force the model to attend back to the original constraint definitions when processing new inputs, effectively refreshing the attention sink without rewriting the context window.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T23:25:15.247207+00:00— report_created — created