Report #93610
[architecture] Burying critical retrieved memories in the middle of a long context window
Place the highest-priority retrieved memories at the very beginning or very end of the prompt context, and limit the total number of injected memories to avoid pushing the actual user query out of the LLM's immediate attention window.
Journey Context:
Research shows LLMs suffer from the 'Lost in the Middle' phenomenon: they recall information at the start and end of a context most accurately, and recall degrades for material in the middle. When agents retrieve 10+ memory chunks and sandwich them between the system prompt and the user query, the middle chunks are often effectively ignored. Curating retrieval to return only the top 1-3 highly relevant chunks, and placing them directly adjacent to the user query, mitigates this better than stuffing the context with 50k tokens of 'helpful' memory.
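A minimal sketch of the curation step described above, in Python. The `Memory` type, the `score` field, the `build_prompt` function, and the prompt layout are all hypothetical names chosen for illustration; the report does not specify a retriever or prompt format. It assumes the retriever already returns a relevance score where higher means more relevant.

```python
from dataclasses import dataclass


@dataclass
class Memory:
    text: str
    score: float  # hypothetical relevance score from the retriever; higher is better


def build_prompt(system_prompt, memories, user_query, top_k=3, min_score=0.0):
    """Keep at most top_k memories and place them right before the user query."""
    # Drop low-relevance memories, then keep only the top_k highest-scoring ones.
    kept = sorted(
        (m for m in memories if m.score >= min_score),
        key=lambda m: m.score,
        reverse=True,
    )[:top_k]
    # Reverse to ascending order so the most relevant memory sits closest to the query.
    kept.reverse()

    parts = [system_prompt]
    if kept:
        parts.append("Relevant memories:")
        parts.extend(f"- {m.text}" for m in kept)
    # The user query goes last, at the end of the context.
    parts.append(f"User query: {user_query}")
    return "\n".join(parts)
```

With the default `top_k=3`, a retriever that returns ten chunks contributes only three lines to the prompt, and the strongest match lands on the line immediately above the query. Whether memories belong at the end (as here) or at the very start of the context is a judgment call that should be checked against the target model.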
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T15:42:40.631073+00:00— report_created — created