Report #62289
[architecture] Storing entire conversation state in vector database
Keep the immediate working set (last N turns, current task state) in the context window. Use the vector store only for cross-session facts, episodic memory, and semantic knowledge. Treat the context window as RAM and the vector store as disk.
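A minimal Python sketch of this split, under stated assumptions: `WorkingMemory`, `InMemoryStore`, and `build_prompt` are hypothetical names invented for illustration, and the toy store ranks facts by keyword overlap rather than real embeddings. It is not tied to any particular vector database client. The point is only that recent turns and task state are passed verbatim, while the store is queried solely for long-term facts.

```python
from collections import deque


class InMemoryStore:
    """Toy stand-in for a vector store (hypothetical; keyword overlap, no embeddings)."""

    def __init__(self, facts):
        self.facts = list(facts)

    def search(self, query, k):
        # Score each fact by the number of words it shares with the query.
        words = set(query.lower().split())
        scored = [(len(words & set(f.lower().split())), f) for f in self.facts]
        scored = [pair for pair in scored if pair[0] > 0]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [fact for _, fact in scored[:k]]


class WorkingMemory:
    """Last N turns plus current task state, always sent in full (the "RAM")."""

    def __init__(self, max_turns=6):
        self.turns = deque(maxlen=max_turns)  # oldest turns fall off automatically
        self.task_state = {}

    def add_turn(self, role, text):
        self.turns.append((role, text))


def build_prompt(memory, store, user_msg, k=2):
    """Assemble a prompt: retrieved long-term facts, then the verbatim working set."""
    parts = []
    recalled = store.search(user_msg, k)  # retrieval is used only for long-term recall
    if recalled:
        parts.append("Long-term facts:\n" + "\n".join(f"- {r}" for r in recalled))
    if memory.task_state:
        state = "\n".join(f"{key}: {val}" for key, val in memory.task_state.items())
        parts.append("Task state:\n" + state)
    if memory.turns:
        recent = "\n".join(f"{role}: {text}" for role, text in memory.turns)
        parts.append("Recent turns:\n" + recent)
    parts.append(f"user: {user_msg}")
    return "\n\n".join(parts)
```

In this sketch the recent turns never pass through retrieval, so they cannot be dropped or reordered by a ranking step. Turns that age out of the window would need to be summarized or written to the long-term store separately, which is not shown here.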
Journey Context:
Developers often try to run retrieval (RAG) over the current conversation to save tokens. This adds latency and retrieval errors for content the LLM should simply see directly. The context window is for active reasoning; the vector DB is for long-term recall. Moving active working memory into a vector DB fragments state and breaks coherence, because each turn sees only the chunks retrieval happens to return.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T11:02:17.977644+00:00— report_created — created