Report #2618
[architecture] Should I put everything in the context window or use a vector database?
Use the context window for current reasoning state and task dependencies; use vector or graph stores for factual recall across sessions. Do not substitute RAG for working memory.
Journey Context:
Context windows are fast and coherent, but expensive and hard-capped in size; vector stores scale, but add latency and can bury the right fact among semantically similar near misses. A common failure mode is treating RAG as a complete memory system. It is not: retrieval only fetches candidate facts, and the model still needs those facts assembled in the right order in-context before it can reason with them. The rule: store in vectors, reason in context. Anthropic's guidance is to keep the prompt focused and offload broad recall to retrieval.
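A minimal sketch of "store in vectors, reason in context", under stated assumptions: the bag-of-words `embed` function is a toy stand-in for a real embedding model, the `Fact.seq` ordering key and the character budget (a rough proxy for tokens) are hypothetical, and none of the names below come from a specific library. It illustrates the split only: retrieval picks candidates, and a separate step assembles them in order next to the working state.

```python
import math
from collections import Counter
from dataclasses import dataclass


@dataclass
class Fact:
    seq: int   # hypothetical ordering key, e.g. when the fact was recorded
    text: str


def embed(text: str) -> Counter:
    # Toy bag-of-words vector; a real system would call an embedding model.
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(store: list[Fact], query: str, k: int) -> list[Fact]:
    # Long-term recall: rank stored facts by similarity, best match first.
    q = embed(query)
    ranked = sorted(store, key=lambda f: cosine(q, embed(f.text)), reverse=True)
    return ranked[:k]


def build_context(working_state: list[str], ranked_facts: list[Fact], max_chars: int) -> str:
    # Working memory: task state always goes in and is never trimmed.
    header = ["Current task state:", *working_state, "Recalled facts:"]
    used = sum(len(line) + 1 for line in header)
    kept = []
    for fact in ranked_facts:
        cost = len(fact.text) + 3  # "- " prefix plus newline
        if used + cost > max_chars:
            break  # stop at the first fact that would exceed the budget
        kept.append(fact)
        used += cost
    # Re-order what survived so the model reads facts in sequence, not by score.
    kept.sort(key=lambda f: f.seq)
    return "\n".join(header + [f"- {f.text}" for f in kept])


store = [
    Fact(1, "billing service was first deployed on postgres 12 two years ago"),
    Fact(2, "billing service uses postgres 16"),
    Fact(3, "the office coffee machine is broken"),
]
state = ["goal: plan the billing database upgrade"]
hits = retrieve(store, "which postgres version does the billing service use", k=2)
context = build_context(state, hits, max_chars=400)
print(context)
```

The design choice worth noting is that ranking and ordering are separate steps: similarity decides which facts get in, while the sequence key decides how they are laid out for reasoning.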
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-15T13:28:48.949664+00:00— report_created — created