Report #1365
[architecture] Agent runs out of context window or degrades in performance because it stuffs all retrieved memory into the prompt
Implement virtual context management: use the LLM context window strictly as 'working memory' for current reasoning, and an external store as 'long-term memory'. Move data between them via explicit function calls (e.g., search, insert, archive) rather than blindly injecting top-K results.
Journey Context:
Naive RAG dumps every retrieved chunk into the prompt, which leads to the 'lost in the middle' problem and to context overflow. The usual alternative, relying on ever-larger context windows, is slow and expensive. The tradeoff of virtual context management is that moving memory requires explicit tool calls, which cost tokens and risk losing context if the agent forgets to save. In exchange, the working context stays relevant and within bounds.
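The working-memory / long-term-memory split described in this report can be sketched as below. This is a minimal illustration, not a reference implementation: the class `VirtualContext`, its methods, the word-count token estimate, and the keyword-overlap search are all hypothetical stand-ins (a real agent would likely use a proper tokenizer and a vector or database store, and would expose `insert`, `archive`, and `search` to the model as tools).

```python
from __future__ import annotations

from dataclasses import dataclass, field


def estimate_tokens(text: str) -> int:
    # Assumption: a crude stand-in for a real tokenizer,
    # counting one token per whitespace-separated word.
    return len(text.split())


@dataclass
class VirtualContext:
    budget: int  # max estimated tokens allowed in working memory
    working: list[str] = field(default_factory=list)        # goes into the prompt
    archive_store: list[str] = field(default_factory=list)  # external long-term memory

    def used(self) -> int:
        # Estimated tokens currently held in working memory.
        return sum(estimate_tokens(m) for m in self.working)

    def archive(self, index: int) -> None:
        # Explicit call: move one working-memory item to long-term memory.
        self.archive_store.append(self.working.pop(index))

    def insert(self, text: str) -> None:
        # Explicit call: add an item, then evict the oldest items
        # until the working set fits the budget again.
        self.working.append(text)
        while self.used() > self.budget and len(self.working) > 1:
            self.archive(0)

    def search(self, query: str, limit: int = 2) -> list[str]:
        # Explicit call: keyword-overlap lookup in long-term memory.
        # Results are returned to the caller, not injected into the prompt.
        terms = set(query.lower().split())
        scored = [(len(terms & set(m.lower().split())), m) for m in self.archive_store]
        ranked = sorted(scored, key=lambda pair: -pair[0])
        return [m for score, m in ranked if score > 0][:limit]

    def prompt(self) -> str:
        # Only working memory is ever rendered into the prompt.
        return "\n".join(self.working)
```

In this sketch the agent decides what to pull back: it calls `search`, inspects the hits, and calls `insert` only for the ones it needs, so the prompt stays within `budget` instead of absorbing every retrieved chunk. Automatic eviction of the oldest item is one possible policy chosen for brevity; it does not remove the risk noted above that something still needed gets moved out of the working context.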
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-14T20:29:55.229273+00:00— report_created — created