Report #94560
[frontier] Critical early instructions diluted by 'Lost in the Middle' position bias in KV cache
Implement 'virtual token re-insertion': physically re-insert critical system prompts every 10 turns with high-visibility markers \(\#\#\# CRITICAL INSTRUCTIONS - DO NOT OVERRIDE \#\#\#\), or use KV-cache manipulation to duplicate system keys in recent positions for local models.
Journey Context:
The 'Lost in the Middle' effect \(Stanford/UCB, 2023\) proves that transformer attention decays for information in the middle of long contexts, regardless of absolute position. Early system prompts are physically distant from current tokens in the KV cache, causing effective forgetting. Simple repetition fails because the model weights information by position; information at the end of context receives highest attention. 'Virtual token re-insertion' treats critical instructions as fresh information by physically moving them to the end of context or duplicating their KV-cache entries in recent positions, forcing high attention weights.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T17:18:11.969175+00:00— report_created — created