Report #78662
[architecture] When to use the LLM context window vs an external vector store for agent memory
Treat the context window as L1 cache for the current execution graph (scratchpad, immediate tool outputs) and the external store as L2/L3 for cross-turn or cross-session state.
Journey Context:
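People often stuff everything into the context window because reading from it needs no retrieval step, but this runs into token limits and degrades instruction following. Conversely, putting everything in a vector DB adds latency and retrieval errors for trivial state (like "current step is 3"). The tradeoff is low latency and exact recall in the context window versus capacity in the external store. The right call is hierarchical memory management: keep small, hot state in context, and move older or bulkier material to the store, retrieving it only when needed.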
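The hierarchy described above can be sketched roughly as follows. This is a minimal illustration, not a reference implementation: `HierarchicalMemory`, `ExternalStore`, and `estimate_tokens` are hypothetical names, the token count is a crude word count rather than a real tokenizer, and the store ranks by word overlap as a stand-in for embedding similarity in an actual vector DB.

```python
from dataclasses import dataclass, field


def estimate_tokens(text: str) -> int:
    """Crude stand-in for a real tokenizer: one token per whitespace-separated word."""
    return len(text.split())


@dataclass
class ExternalStore:
    """Hypothetical L2/L3 tier. Ranks by word overlap instead of embeddings."""
    records: list = field(default_factory=list)

    def add(self, text: str) -> None:
        self.records.append(text)

    def search(self, query: str, k: int = 2) -> list:
        query_words = set(query.lower().split())
        scored = [(len(query_words & set(r.lower().split())), r) for r in self.records]
        scored = [(score, r) for score, r in scored if score > 0]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [r for _, r in scored[:k]]


@dataclass
class HierarchicalMemory:
    """Hypothetical L1 tier (state + scratchpad) backed by an external store."""
    token_budget: int
    store: ExternalStore = field(default_factory=ExternalStore)
    state: dict = field(default_factory=dict)       # trivial state, e.g. current step
    scratchpad: list = field(default_factory=list)  # tool outputs, oldest first

    def set_state(self, key: str, value) -> None:
        # Trivial state always stays in context; it never goes through retrieval.
        self.state[key] = value

    def add_tool_output(self, text: str) -> None:
        self.scratchpad.append(text)
        # Over budget: evict the oldest entries to the external store,
        # always keeping at least the most recent output in context.
        while self._scratchpad_tokens() > self.token_budget and len(self.scratchpad) > 1:
            self.store.add(self.scratchpad.pop(0))

    def _scratchpad_tokens(self) -> int:
        return sum(estimate_tokens(t) for t in self.scratchpad)

    def build_context(self, query: str) -> str:
        # Assemble the prompt: exact state, then recalled items, then the scratchpad.
        lines = [f"{key}: {value}" for key, value in self.state.items()]
        lines += [f"[recalled] {r}" for r in self.store.search(query)]
        lines += self.scratchpad
        return "\n".join(lines)


if __name__ == "__main__":
    memory = HierarchicalMemory(token_budget=8)
    memory.set_state("current_step", 3)
    memory.add_tool_output("user prefers metric units for distances")
    memory.add_tool_output("weather api returned 12 degrees")  # evicts the first entry
    print(memory.build_context("which units does the user prefer"))
```

In this sketch, "current step is 3" is read directly from `state` with no retrieval involved, while the evicted tool output only re-enters the context when a query happens to match it. The eviction policy (oldest first) and the budget value are arbitrary choices for illustration.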
Lifecycle
2026-06-21T14:37:57.299427+00:00— report_created — created