Report #91626
[synthesis] LLM fails to follow instructions or retrieve facts placed in the middle of a large context window (>50k tokens)
Structure RAG contexts by placing the most critical instructions and retrieved documents at the very beginning and the very end of the prompt. For Gemini, explicitly summarize the key takeaway at the end.
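A minimal sketch of the edge-placement idea, assuming the retriever already returns documents sorted most-relevant first. The names `order_for_edges` and `build_prompt`, the `[Document N]` labels, and the "Key takeaway" / "Reminder" wording are illustrative assumptions, not part of any library or of the original report.

```python
from typing import List, Sequence


def order_for_edges(docs_by_relevance: Sequence[str]) -> List[str]:
    """Place the most relevant documents at the two ends of the list.

    Input must be sorted most-relevant first. Ranks alternate between the
    front and the back, so the least relevant documents land in the middle.
    """
    front: List[str] = []
    back: List[str] = []
    for i, doc in enumerate(docs_by_relevance):
        if i % 2 == 0:
            front.append(doc)   # ranks 1, 3, 5, ... fill from the start
        else:
            back.append(doc)    # ranks 2, 4, 6, ... fill from the end
    return front + back[::-1]


def build_prompt(
    instructions: str,
    docs_by_relevance: Sequence[str],
    question: str,
    closing_summary: str = "",
) -> str:
    """Assemble a prompt with instructions at both the beginning and the end.

    closing_summary is optional; the report suggests adding an explicit
    key takeaway at the end when targeting Gemini.
    """
    ordered = order_for_edges(docs_by_relevance)
    parts = [instructions]
    parts += [f"[Document {n}]\n{doc}" for n, doc in enumerate(ordered, start=1)]
    if closing_summary:
        parts.append(f"Key takeaway: {closing_summary}")
    parts.append(f"Reminder of the instructions: {instructions}")
    parts.append(f"Question: {question}")
    return "\n\n".join(parts)
```

For five documents ranked a to e, `order_for_edges` yields a, c, e, d, b: the top two hits sit at the first and last positions, and the weakest hits fall in the middle. Whether this ordering helps a given model should be checked empirically.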
Journey Context:
The "lost in the middle" phenomenon affects all models, but with different fingerprints. GPT-4o has relatively flat attention but degrades in the middle of very large contexts. Gemini 1.5 Pro, despite its 2M-token window, exhibits severe recency and primacy bias: it ignores middle chunks entirely and hallucinates from pre-training if the answer isn't at the edges. Claude is more robust in the middle but still prioritizes the edges. A single "put everything in context" approach fails; data positioning must be architected.
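Because the positional fingerprint differs per model, it may be worth measuring it on your own stack rather than relying on the descriptions above. The sketch below only builds probe prompts with a known fact (the "needle") inserted at different relative depths; sending them to a model and scoring the answers is left to the caller. `place_needle`, `probe_prompts`, and the default depth values are hypothetical names and choices made for illustration.

```python
from typing import Dict, List, Sequence


def place_needle(filler: Sequence[str], needle: str, depth: float) -> List[str]:
    """Insert needle among filler chunks at a relative depth.

    depth 0.0 puts the needle first, 1.0 puts it last, 0.5 near the middle.
    """
    if not 0.0 <= depth <= 1.0:
        raise ValueError("depth must be between 0.0 and 1.0")
    index = round(depth * len(filler))
    chunks = list(filler)
    return chunks[:index] + [needle] + chunks[index:]


def probe_prompts(
    filler: Sequence[str],
    needle: str,
    question: str,
    depths: Sequence[float] = (0.0, 0.25, 0.5, 0.75, 1.0),
) -> Dict[float, str]:
    """Build one prompt per depth; the same question follows every context."""
    prompts: Dict[float, str] = {}
    for depth in depths:
        context = "\n\n".join(place_needle(filler, needle, depth))
        prompts[depth] = f"{context}\n\nQuestion: {question}"
    return prompts
```

Comparing answer accuracy across depths, with enough filler to exceed the context size where failures were observed, gives a rough picture of where a particular model starts to lose mid-context content.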
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T12:23:07.221606+00:00— report_created — created