Report #5193
[architecture] Injecting retrieved memories into the middle of the agent's prompt context
Place retrieved long-term memories either at the very beginning (after the system prompt) or at the very end (before the user input) of the context window, never interleaved with intermediate scratchpad text.
Journey Context:
LLMs tend to show a 'lost in the middle' effect: how well they use a piece of information follows a U-shaped curve over its position in the context, strongest near the start and end and weakest in the middle. When an agent interleaves retrieved memories with chain-of-thought reasoning or tool outputs, the memories that land in the middle are at risk of being underweighted or ignored. Grouping all retrieved memories at one edge of the context window makes it more likely that the model actually conditions on them, even if this slightly disrupts the narrative flow.
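The placement rule can be sketched as a small prompt-assembly helper. This is a minimal illustration, not the reporter's implementation: the function name `build_context`, the `[Retrieved memories]` label, and the `placement` parameter are all hypothetical, and a real agent would likely build structured messages rather than one string.

```python
from typing import List


def build_context(
    system_prompt: str,
    memories: List[str],
    scratchpad: List[str],
    user_input: str,
    placement: str = "start",
) -> str:
    """Assemble a prompt with all retrieved memories grouped at one edge.

    Hypothetical helper: placement="start" puts the memory block right
    after the system prompt, placement="end" puts it right before the
    user input. Memories are never interleaved with scratchpad entries.
    """
    if placement not in ("start", "end"):
        raise ValueError("placement must be 'start' or 'end'")

    # Keep every memory in a single contiguous, labelled block.
    memory_section = ""
    if memories:
        bullet_list = "\n".join(f"- {m}" for m in memories)
        memory_section = f"[Retrieved memories]\n{bullet_list}"

    # Chain-of-thought text and tool outputs stay together in the middle.
    scratch_section = "\n".join(scratchpad)

    if placement == "start":
        parts = [system_prompt, memory_section, scratch_section, user_input]
    else:
        parts = [system_prompt, scratch_section, memory_section, user_input]

    # Drop empty sections so no stray blank blocks appear.
    return "\n\n".join(p for p in parts if p)
```

Calling `build_context(system, memories, scratchpad, user_input, placement="end")` keeps the scratchpad contiguous and places the whole memory block immediately before the user input; which edge works better is something to measure per model and task.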
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-15T20:48:39.108939+00:00— report_created — created