Report #3530
[research] Long-context models miss or hallucinate facts located in the middle of long documents
Place the most critical evidence at the start or end of the prompt; chunk and route long inputs; validate the context window with needle-in-haystack evaluations.
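A minimal sketch of the "chunk, route, and place at the edges" advice, under stated assumptions: the helper names (chunk_text, rank_chunks, order_for_edges) are hypothetical, and the keyword-overlap scorer is only a stand-in for whatever retriever or router is actually in use. It shows one way to arrange ranked chunks so the strongest evidence lands at the start and end of the prompt and the weakest in the middle.

```python
from typing import List


def chunk_text(text: str, size: int = 200, overlap: int = 50) -> List[str]:
    """Split text into fixed-size character chunks that overlap slightly."""
    if size <= overlap:
        raise ValueError("size must be larger than overlap")
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step)]


def rank_chunks(chunks: List[str], query: str, top_k: int = 5) -> List[str]:
    """Toy router (assumption): rank chunks by how many query words they contain."""
    words = set(query.lower().split())
    scored = sorted(
        chunks,
        key=lambda c: sum(w in c.lower() for w in words),
        reverse=True,
    )
    return scored[:top_k]


def order_for_edges(ranked: List[str]) -> List[str]:
    """Take chunks sorted best-first and alternate them between the front and
    the back, so the least relevant chunks end up in the middle."""
    front: List[str] = []
    back: List[str] = []
    for i, chunk in enumerate(ranked):
        (front if i % 2 == 0 else back).append(chunk)
    return front + back[::-1]
```

With five chunks ranked a (best) through e (worst), order_for_edges places a first, b last, and e in the center. A real pipeline would likely swap the keyword scorer for embedding similarity or a reranker.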
Journey Context:
Long context is convenient, but recall across it tends to be U-shaped: models attend best to the beginning and end of the prompt and worst to the middle. A common mistake is dumping a full codebase or PDF into the prompt and assuming the model saw the middle. Chunked retrieval with a routing layer is usually more reliable than a single giant prompt. The benchmark to run is needle-in-haystack, which measures retrieval at each position in the context, rather than perplexity alone.
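A needle-in-haystack check can be sketched as follows. This is an illustration, not a standard harness: build_haystack, run_needle_eval, and edge_only_model are hypothetical names, and edge_only_model is a fake model that only "sees" the first and last 400 characters, used here to simulate the U-shaped failure without calling a real API. In practice the ask callable would wrap the model under test.

```python
from typing import Callable, Dict, List


def build_haystack(filler: List[str], needle: str, depth: float) -> str:
    """Insert the needle among filler sentences at a relative depth in [0, 1]."""
    if not 0.0 <= depth <= 1.0:
        raise ValueError("depth must be between 0 and 1")
    pos = round(depth * len(filler))
    return " ".join(filler[:pos] + [needle] + filler[pos:])


def run_needle_eval(
    ask: Callable[[str], str],
    filler: List[str],
    needle: str,
    question: str,
    expected: str,
    depths: List[float],
) -> Dict[float, bool]:
    """For each depth, report whether the model's answer contains the expected fact."""
    results: Dict[float, bool] = {}
    for depth in depths:
        prompt = build_haystack(filler, needle, depth) + "\n\nQuestion: " + question
        answer = ask(prompt)
        results[depth] = expected.lower() in answer.lower()
    return results


def edge_only_model(prompt: str, window: int = 400) -> str:
    """Stand-in model (assumption): reads only the start and end of the prompt."""
    visible = prompt[:window] + prompt[-window:]
    return "7421" if "7421" in visible else "I don't know"


filler = [f"Filler sentence number {i}." for i in range(200)]
results = run_needle_eval(
    ask=edge_only_model,
    filler=filler,
    needle="The vault code is 7421.",
    question="What is the vault code?",
    expected="7421",
    depths=[0.0, 0.5, 1.0],
)
print(results)
```

Sweeping depths (and, in a fuller version, context lengths) gives a per-position pass/fail map, which is the signal that perplexity alone does not provide.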
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-15T17:30:17.180422+00:00— report_created — created