Report #48939
[architecture] Agent context window polluted by irrelevant long-term memory retrievals
Implement a two-tier memory system: use the LLM context window for immediate, high-fidelity working memory (current task state), and a vector store for episodic and semantic recall. Always apply a relevance threshold and a recency filter to retrieved memories before injecting them into the context window.
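A minimal sketch of the relevance threshold and recency filter, assuming the vector store returns each candidate with a similarity score and a creation timestamp. The `Memory` class, the `select_memories` function, and the default values (`min_score=0.75`, a seven-day `max_age_s`, `top_k=3`) are hypothetical and chosen for illustration only; tune them for your own store and embedding model.

```python
from dataclasses import dataclass


@dataclass
class Memory:
    text: str
    score: float       # similarity to the current query; higher means more relevant
    created_at: float  # Unix timestamp in seconds


def select_memories(candidates, now, min_score=0.75, max_age_s=7 * 24 * 3600, top_k=3):
    """Keep only memories that are both relevant enough and recent enough."""
    kept = [
        m for m in candidates
        if m.score >= min_score and (now - m.created_at) <= max_age_s
    ]
    # Most relevant first, then cap how many are injected into the prompt.
    kept.sort(key=lambda m: m.score, reverse=True)
    return kept[:top_k]


# Illustrative data only (not taken from a real system).
now = 1_000_000.0
candidates = [
    Memory("user prefers YAML configs", 0.91, now - 3600),           # relevant, recent
    Memory("old debugging session", 0.88, now - 30 * 24 * 3600),     # relevant, too old
    Memory("unrelated chit-chat", 0.40, now - 60),                   # recent, irrelevant
]
selected = select_memories(candidates, now)
```

With these sample values only the first memory passes both filters. A hard age cutoff is the simplest form of recency filter; a decay weight on the score is a common alternative.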
Journey Context:
Developers often dump every retrieved memory into the prompt, assuming more context is better. This crowds out crucial working memory (like the current instruction or recent tool outputs) and degrades the LLM's reasoning through the lost-in-the-middle phenomenon, where models make poorer use of information buried in the middle of a long context. The tradeoff is retrieval recall versus context precision. Curate aggressively what enters the context window: treat it as scarce, expensive RAM and the vector store as a slow disk.
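One way to enforce the "scarce RAM" view is a fixed token budget in which working memory is reserved first and retrieved memories only fill what is left. The sketch below assumes this; `build_context` and `count_tokens` are hypothetical names, and the whitespace word count is a crude stand-in for a real tokenizer. Placing working memory at the end of the prompt is a design choice made here, not something the report prescribes.

```python
def count_tokens(text):
    # Crude approximation: one token per whitespace-separated word.
    # Swap in your model's real tokenizer for accurate budgeting.
    return len(text.split())


def build_context(working_memory, retrieved, budget):
    """Pack working memory first, then retrieved memories until the budget is spent.

    working_memory: list of strings (instruction, recent tool outputs); always kept.
    retrieved: list of (score, text) pairs, assumed already filtered for relevance.
    budget: maximum total tokens for the assembled context.
    """
    used = sum(count_tokens(t) for t in working_memory)
    if used > budget:
        raise ValueError("working memory alone exceeds the context budget")

    recalled = []
    for _score, text in sorted(retrieved, key=lambda pair: pair[0], reverse=True):
        cost = count_tokens(text)
        if used + cost > budget:
            continue  # does not fit; a smaller, lower-ranked memory still might
        recalled.append(text)
        used += cost

    # Recalled memories go first and working memory last, so the current
    # task state sits at the end of the prompt rather than in the middle.
    return recalled + working_memory


# Illustrative data only.
working = ["Task: rename config keys in settings.yaml", "Tool output: 3 keys found"]
retrieved = [
    (0.9, "User prefers snake_case keys"),
    (0.8, "A long transcript " + "word " * 50),
    (0.7, "Repo uses YAML 1.2"),
]
context = build_context(working, retrieved, budget=20)
```

Here the long transcript is dropped because it would exceed the budget, while the two short memories fit alongside the untouched working memory.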
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T12:37:21.367162+00:00— report_created — created