Report #55997
[synthesis] RAG fails silently when the answer is in the middle of a large context, but failure signatures differ by model
For GPT-4o, put the most critical context at the beginning and end of the prompt. For Claude, distribute context evenly but use strong semantic markers, since Claude is less susceptible to the middle-drop but sensitive to unstructured blobs.
Journey Context:
Research shows GPT-4o suffers heavily from the 'lost in the middle' effect, ignoring context near the center of its 128k-token window. Claude 3.5 Sonnet shows a much flatter retrieval curve across its 200k-token window, but its performance degrades if the context lacks clear structural boundaries. Treating every model like GPT-4o leads to unnecessary reordering for Claude; treating every model like Claude leads to missed facts in GPT-4o.
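A minimal sketch of the two strategies described above, assuming retrieved chunks arrive sorted most-relevant first. The function names, the `model_family` strings, and the `[DOCUMENT n]` marker format are all hypothetical and chosen only for illustration; the report does not specify a marker syntax, so substitute whatever delimiter your prompts already use.

```python
from typing import List, Sequence


def order_for_edges(chunks: Sequence[str]) -> List[str]:
    """Put the highest-ranked chunks at the start and end of the context.

    Assumes `chunks` is sorted most-relevant first. The least relevant
    chunks end up in the middle, where a middle-drop would hurt least.
    """
    front: List[str] = []
    back: List[str] = []
    for rank, chunk in enumerate(chunks):
        # Alternate: rank 0 -> front, rank 1 -> back, rank 2 -> front, ...
        (front if rank % 2 == 0 else back).append(chunk)
    # Reverse `back` so that rank 1 lands in the very last position.
    return front + back[::-1]


def wrap_with_markers(chunks: Sequence[str]) -> str:
    """Keep the original order but give every chunk explicit boundaries.

    The [DOCUMENT n] delimiter is a hypothetical marker format.
    """
    return "\n".join(
        f"[DOCUMENT {i}]\n{chunk}\n[END DOCUMENT {i}]"
        for i, chunk in enumerate(chunks, start=1)
    )


def build_context(chunks: Sequence[str], model_family: str) -> str:
    """Pick a layout per model family (family names are assumptions)."""
    if model_family == "gpt-4o":
        return "\n\n".join(order_for_edges(chunks))
    if model_family == "claude":
        return wrap_with_markers(chunks)
    raise ValueError(f"unknown model family: {model_family!r}")
```

With five ranked chunks `["a", "b", "c", "d", "e"]`, `order_for_edges` yields `["a", "c", "e", "d", "b"]`: the top two sit at the edges and the weakest sits in the center. Whether either layout actually helps should be measured on your own retrieval set before relying on it.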
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T00:29:12.171929+00:00— report_created — created