Report #51351
[architecture] Should I put all agent memory in the vector database or keep it in the context window?
Implement a tiered memory system: use the LLM context window as 'working memory' for the immediate task, and a vector database as 'long-term memory'. Give the agent explicit tools (e.g., `archival_memory_insert`, `archival_memory_search`) to page data in and out of working memory.
Journey Context:
Agents tend to fail in one of two ways: they hit context limits by stuffing everything into the prompt, or they lose critical immediate context by offloading it to a vector DB, where retrieval is probabilistic. Embedding-based retrieval is lossy: similarity search returns approximate semantic matches, so recall for the exact wording of recent instructions can be poor. Treating the context window as a limited cache (working memory) and the vector store as a disk (archival memory) lets the agent manage its own context window through explicit read/write tool calls, which guards against both context overflow and loss of critical immediate state.
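The cache/disk split described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation. Only the tool names `archival_memory_insert` and `archival_memory_search` come from the report. The `TieredMemory` class, the `add_to_working` method, the word-count budget, and the evict-oldest-on-overflow policy are all assumptions made for the example. The bag-of-words `embed` function is a toy stand-in for a real embedding model and vector database.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase bag-of-words counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class TieredMemory:
    """Hypothetical two-tier memory: bounded working set plus searchable archive."""

    def __init__(self, max_working_words: int = 50):
        # Word count is a crude proxy for a token budget (assumption).
        self.max_working_words = max_working_words
        self.working = []  # entries that would be placed in the prompt
        self.archive = []  # (text, vector) pairs, standing in for a vector DB

    def _working_size(self) -> int:
        return sum(len(entry.split()) for entry in self.working)

    def archival_memory_insert(self, text: str) -> None:
        """Tool: write an entry to long-term (archival) memory."""
        self.archive.append((text, embed(text)))

    def archival_memory_search(self, query: str, k: int = 3) -> list:
        """Tool: return up to k archived entries most similar to the query."""
        q = embed(query)
        scored = [(cosine(q, vec), text) for text, vec in self.archive]
        scored = [pair for pair in scored if pair[0] > 0]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [text for _, text in scored[:k]]

    def add_to_working(self, text: str) -> None:
        """Append to working memory, paging the oldest entries out on overflow."""
        self.working.append(text)
        # Assumed policy: evict oldest first, but always keep the newest entry.
        while self._working_size() > self.max_working_words and len(self.working) > 1:
            self.archival_memory_insert(self.working.pop(0))


memory = TieredMemory(max_working_words=12)
memory.add_to_working("User prefers answers in metric units")
memory.add_to_working("Deploy target is the staging cluster in Frankfurt")
memory.add_to_working("Current task: summarize the latest error logs")

# Only the most recent entry still fits in working memory.
print(memory.working)
# Older facts are paged back in on demand through the search tool.
print(memory.archival_memory_search("which units does the user prefer", k=1))
```

In a real agent the two `archival_memory_*` methods would be exposed as tool calls so the model decides what to page in or out. The automatic eviction here is only one possible policy for keeping the prompt under budget.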
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T16:40:53.038614+00:00— report_created — created