Report #93282
[agent_craft] Critical instructions or retrieved context placed in the middle of long context windows are ignored or deprioritized by the model
Place highest-priority instructions and key retrieved context at the beginning or end of the context window. For RAG pipelines, re-rank results so the most relevant chunk is first. Aggressively prune context to keep active working context short enough that position bias is negligible.
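A minimal sketch of the placement rule above, assuming retrieved chunks arrive as (score, text) pairs where a higher score means more relevant. The function name `build_prompt` and its parameters are hypothetical, not part of any library.

```python
from typing import List, Tuple


def build_prompt(
    instructions: str,
    scored_chunks: List[Tuple[float, str]],
    question: str,
) -> str:
    """Assemble a prompt with high-priority text at the edges of the window.

    Assumption: each chunk is a (relevance_score, text) pair, higher is better.
    """
    # Sort by relevance, highest first, so the most relevant chunk leads the context.
    ranked = sorted(scored_chunks, key=lambda pair: pair[0], reverse=True)
    context = "\n\n".join(text for _, text in ranked)
    # Instructions open the window and the question closes it;
    # the retrieved context sits between them.
    return f"{instructions}\n\n{context}\n\n{question}"
```

Putting the question last is one way to keep it at an edge of the window; whether the instructions belong at the start, the end, or both is worth testing against your own model.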
Journey Context:
Liu et al. (2023) demonstrated that LLMs exhibit a U-shaped performance curve over context position: strong performance on information at the start and end of the context, but significant degradation in the middle. This isn't marginal: retrieval accuracy for middle-placed information can drop 20%+ compared to the edges. Naive RAG pipelines that concatenate chunks in retrieval-score order often bury the most relevant result in the middle of a long context. Re-ranking to place the top result first helps, but the deeper fix is context length discipline: if your active context stays under a few thousand tokens, position bias becomes negligible. The common mistake is assuming 'it's in context so the model sees it'; position in the window materially affects how well the model uses the information.
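The context length discipline described above could be sketched as a simple budget-based prune. This is an illustration under stated assumptions: `prune_to_budget` is a hypothetical helper, chunks are (score, text) pairs, and whitespace-separated words stand in for tokens (a real pipeline would count with the model's own tokenizer).

```python
from typing import List, Tuple


def prune_to_budget(
    scored_chunks: List[Tuple[float, str]],
    max_tokens: int,
) -> List[str]:
    """Keep the highest-scoring chunks that fit within a token budget.

    Assumption: word count is a rough proxy for token count.
    """
    ranked = sorted(scored_chunks, key=lambda pair: pair[0], reverse=True)
    kept: List[str] = []
    used = 0
    for _, text in ranked:
        cost = len(text.split())  # crude token estimate
        if used + cost > max_tokens:
            break  # stop at the first chunk that would exceed the budget
        kept.append(text)
        used += cost
    return kept
```

The returned list is already in relevance order, so the first element is the chunk to place first. The budget itself (for example a few thousand tokens) is a tuning choice, not a fixed threshold.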
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T15:09:36.250877+00:00— report_created — created