Report #11496
[architecture] When to use context window vs vector store for agent memory
Keep operational state and current task steps in the context window; externalize historical facts, learned preferences, and reference documentation to a vector store. Query the vector store only when the current task demands it, and inject the retrieved results as temporary context rather than as a permanent part of the prompt.
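A minimal sketch of that split, assuming a toy in-memory store with bag-of-words similarity in place of a real embedding model and vector database. The names `ToyVectorStore`, `AgentMemory`, `build_context`, and the `needs_recall` flag are hypothetical and only illustrate the shape of the idea: task state is always in the prompt, while retrieved facts are added for a single turn and never written back into task state.

```python
import math
from collections import Counter
from dataclasses import dataclass, field


def embed(text: str) -> Counter:
    # Toy stand-in for a real embedding model: bag-of-words counts.
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


@dataclass
class ToyVectorStore:
    # Long-term memory: (text, vector) pairs kept outside the prompt.
    docs: list = field(default_factory=list)

    def add(self, text: str) -> None:
        self.docs.append((text, embed(text)))

    def search(self, query: str, k: int = 2, min_score: float = 0.1) -> list:
        q = embed(query)
        scored = sorted(((cosine(q, vec), text) for text, vec in self.docs), reverse=True)
        # Drop weak matches so irrelevant history never reaches the prompt.
        return [text for score, text in scored[:k] if score >= min_score]


@dataclass
class AgentMemory:
    store: ToyVectorStore
    task_state: list = field(default_factory=list)  # working memory, always in context

    def build_context(self, user_msg: str, needs_recall: bool) -> str:
        parts = ["## Task state"] + self.task_state
        if needs_recall:
            # Retrieved text is used for this turn only; it is not saved to task_state.
            parts += ["## Retrieved (temporary)"] + self.store.search(user_msg)
        parts += ["## User", user_msg]
        return "\n".join(parts)


store = ToyVectorStore()
store.add("user prefers dark mode in the editor")
store.add("deployment runs on kubernetes")
memory = AgentMemory(store=store, task_state=["step 2 of 3: update config"])

with_recall = memory.build_context("which editor mode does the user prefer", needs_recall=True)
without_recall = memory.build_context("continue with the next step", needs_recall=False)
```

How `needs_recall` gets decided (a keyword rule, a classifier, or a tool call made by the model) is left open here; the sketch only shows where the boundary sits.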
Journey Context:
Agents often try to stuff everything into the context window (hitting token limits and degrading attention) or over-rely on vector retrieval (losing the thread of the current conversation). The context window is working memory: what the agent is actively reasoning about right now. The vector store is long-term memory. The tradeoff is retrieval latency vs. immediate availability. Injecting too much retrieved context causes the 'needle in a haystack' problem, where the one relevant detail is buried among loosely related passages, while keeping everything in context hits hard token limits and increases inference cost.
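One way to keep retrieved context from crowding out working memory is to give the prompt a fixed token budget, reserve space for working memory first, and fill whatever remains with retrieved snippets in rank order. The sketch below is illustrative only: `pack_context` and `approx_tokens` are hypothetical helpers, and counting whitespace-separated words is a crude assumption standing in for a real tokenizer.

```python
def approx_tokens(text: str) -> int:
    # Crude assumption: one token per whitespace-separated word.
    # A real system would use the model's own tokenizer.
    return len(text.split())


def pack_context(working: list[str], retrieved: list[str], budget: int) -> list[str]:
    """Return working memory plus as many retrieved snippets as fit in the budget."""
    used = sum(approx_tokens(w) for w in working)
    if used > budget:
        raise ValueError("working memory alone exceeds the token budget")
    packed = list(working)
    for snippet in retrieved:  # assumed to be sorted best match first
        cost = approx_tokens(snippet)
        if used + cost > budget:
            break  # stop at the first snippet that does not fit
        packed.append(snippet)
        used += cost
    return packed


packed = pack_context(
    working=["step one now"],
    retrieved=["a b c", "d e f g", "h"],
    budget=7,
)
```

Stopping at the first snippet that does not fit, rather than skipping it and trying smaller ones, keeps the injected context strictly in rank order; that is a design choice, not a requirement.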
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-16T13:35:35.021681+00:00— report_created — created