Report #100334
[synthesis] Critical information buried in the middle of a long context is missed
For Claude and GPT-4o, place key instructions and constraints at both the start and the end of the context. For Kimi, keep code-heavy contexts under roughly 100k tokens and add a retrieval layer so that only the relevant chunks enter the prompt. Test with your real document distribution, not just single-needle benchmarks.
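The boundary-placement advice above can be sketched as a small prompt builder. This is a minimal illustration, not any vendor's API: the function name `build_prompt`, the section labels, and the layout are assumptions chosen for the example.

```python
def build_prompt(instructions: str, documents: list[str], question: str) -> str:
    """Assemble a long-context prompt with the instructions at both boundaries.

    Hypothetical helper: the labels (INSTRUCTIONS, DOCUMENT n, REMINDER,
    QUESTION) are illustrative, not a format required by any model.
    """
    rules = instructions.strip()
    parts = ["INSTRUCTIONS:\n" + rules]
    # The bulk of the context sits in the middle, where recall tends to be weakest.
    for number, document in enumerate(documents, start=1):
        parts.append(f"DOCUMENT {number}:\n{document.strip()}")
    # Repeat the same instructions after the documents, just before the question.
    parts.append("REMINDER (same instructions as above):\n" + rules)
    parts.append("QUESTION:\n" + question.strip())
    return "\n\n".join(parts)
```

The resulting string can be sent as a single user message. Whether the repeated block actually improves answers is something to measure on your own documents.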
Journey Context:
All long-context models show non-uniform attention, but the shape differs: Claude 3.5 Sonnet and GPT-4o exhibit a U-shaped pattern where content in the middle of the context is recalled least reliably, while Kimi shows sharper degradation on very long code-heavy contexts. The common mistake is treating the published context-window size as uniformly reliable memory. Real documents contain multiple needles and distractors, so repeating key constraints at the context boundaries and rechunking the documents works better than relying on the raw window size.
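One way to probe the position effect described above is a multi-needle test: plant several known facts at different relative depths among filler text, ask the model to list them, and check which depths were recalled. The sketch below covers only the harness side. The names `build_haystack` and `recall_by_depth` are hypothetical, and the model call itself is left out because it depends on your provider.

```python
def build_haystack(fillers: list[str], needles: dict[float, str]) -> str:
    """Insert each needle at a relative depth (0.0 = start, 1.0 = end)."""
    paragraphs = list(fillers)
    # Insert the deepest needle first so shallower insert positions stay valid.
    for depth, needle in sorted(needles.items(), reverse=True):
        index = round(depth * len(fillers))
        paragraphs.insert(index, needle)
    return "\n\n".join(paragraphs)


def recall_by_depth(answer: str, expected: dict[float, str]) -> dict[float, bool]:
    """Report, per depth, whether the expected fact appears in the model's answer."""
    return {depth: fact in answer for depth, fact in expected.items()}
```

For a realistic run, the fillers should come from your own document distribution, including distractors that resemble the needles, and the substring check should be replaced by whatever matching rule suits your facts.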
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-07-01T05:03:11.669165+00:00— report_created — created