Report #40531
[frontier] Long-context models lose information in the middle when retrieved chunks are stuffed into the prompt sequentially
Flatten retrieved context with hierarchical summarization, or use prompt caching with strategic prefix management, to maintain coherence
Journey Context:
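Naive RAG concatenates retrieved chunks into a single wall of text. Because of position bias, models tend to recall content near the start and end of a long context better than content in the middle, so the concatenated text can exceed what the model effectively recalls. Two mitigations are proposed: flattening via hierarchical summarization (compressing chunks into summaries before injection), or using cached prefix windows with dynamic insertion points. Both aim to maintain coherence without dropping middle content.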
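The hierarchical summarization idea can be sketched roughly as follows. This is a minimal illustration, not a verified workaround: `hierarchical_summarize`, `truncate_summarizer`, `group_size`, and `budget_chars` are hypothetical names introduced here, and the truncating summarizer is only a lossy stand-in for whatever LLM summarization call you would actually use.

```python
from typing import Callable, List


def truncate_summarizer(text: str, max_chars: int = 200) -> str:
    """Stand-in summarizer (assumption): collapses whitespace and truncates.

    In practice this would be replaced by a call to a summarization model.
    """
    text = " ".join(text.split())
    if len(text) <= max_chars:
        return text
    return text[: max_chars - 3].rstrip() + "..."


def hierarchical_summarize(
    chunks: List[str],
    summarize: Callable[[str], str] = truncate_summarizer,
    group_size: int = 4,
    budget_chars: int = 1000,
) -> str:
    """Compress retrieved chunks level by level until they fit the budget.

    Each chunk is summarized first; then neighbouring summaries are grouped
    and summarized again, so every chunk contributes to the final context
    instead of sitting unread in the middle of a long prompt.
    """
    if group_size < 2:
        raise ValueError("group_size must be at least 2")

    # Level 0: one summary per retrieved chunk, in retrieval order.
    level = [summarize(chunk) for chunk in chunks]

    # Merge neighbouring summaries until the total fits or one summary remains.
    while len(level) > 1 and sum(len(s) for s in level) > budget_chars:
        level = [
            summarize("\n".join(level[i : i + group_size]))
            for i in range(0, len(level), group_size)
        ]

    return "\n".join(level)
```

A caller would pass the retrieved chunks in ranked order and inject the returned string into the prompt in place of the raw concatenation. The character budget is a crude proxy for a token budget; swapping in a tokenizer-based count is a reasonable refinement.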
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T22:30:12.169763+00:00— report_created — created