Report #87494
[architecture] Stuffing entire conversation history into the context window or dumping it all to a vector DB
Implement a three-tier memory architecture: L1 (Working Memory - current context window), L2 (Episodic Memory - short-term vector store with fast decay), and L3 (Semantic Memory - long-term compressed knowledge graph/vector store).
Journey Context:
Relying solely on the context window runs into token limits and drives up cost, because the full history is re-sent on every turn. Relying solely on a vector DB loses the immediate, sequential thread of the conversation, because similarity search returns relevant fragments rather than turns in order. You need working memory for the immediate task, episodic memory for recent context, and semantic memory for long-term facts. The tradeoff is the engineering complexity of moving data between tiers, but it prevents both context overflow and the 'amnesia' of pure RAG architectures.
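The movement of data between tiers could be sketched roughly as follows. This is a minimal illustration, not a reference implementation: the class `TieredMemory`, its method names, the half-life decay rule, and every default value are assumptions made for the example. Word overlap stands in for real vector similarity, and the default `summarize` simply truncates text where a real system would likely call a summarization model.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Episode:
    text: str
    created_at: float  # time the turn entered the episodic tier


@dataclass
class TieredMemory:
    """Hypothetical three-tier memory; names and defaults are illustrative."""
    working_capacity: int = 4     # L1: max turns kept verbatim in the prompt
    half_life_s: float = 3600.0   # L2: decay half-life (assumed value)
    min_score: float = 0.1        # L2: decay weight below which an episode expires
    working: deque = field(default_factory=deque)
    episodic: list = field(default_factory=list)
    semantic: list = field(default_factory=list)

    def add_turn(self, text: str, now: float) -> None:
        """Append to L1; spill the oldest turns into L2 when L1 is full."""
        self.working.append(text)
        while len(self.working) > self.working_capacity:
            self.episodic.append(Episode(self.working.popleft(), now))

    def _decay(self, ep: Episode, now: float) -> float:
        """Exponential decay weight: 1.0 when fresh, 0.5 after one half-life."""
        return 0.5 ** ((now - ep.created_at) / self.half_life_s)

    def consolidate(self, now: float, summarize=lambda t: t[:80]) -> None:
        """Move expired L2 episodes into L3 in compressed form."""
        keep = []
        for ep in self.episodic:
            if self._decay(ep, now) < self.min_score:
                self.semantic.append(summarize(ep.text))
            else:
                keep.append(ep)
        self.episodic = keep

    def recall(self, query: str, now: float, k: int = 2) -> list:
        """Rank L2 episodes by word overlap times decay weight.

        Word overlap is a stand-in for vector similarity search.
        """
        q = set(query.lower().split())
        scored = []
        for ep in self.episodic:
            overlap = len(q & set(ep.text.lower().split()))
            if overlap:
                scored.append((overlap * self._decay(ep, now), ep.text))
        scored.sort(key=lambda s: s[0], reverse=True)
        return [text for _, text in scored[:k]]

    def build_context(self, query: str, now: float) -> str:
        """Assemble a prompt: long-term facts, recalled episodes, then recent turns."""
        parts = list(self.semantic) + self.recall(query, now) + list(self.working)
        return "\n".join(parts)


mem = TieredMemory(working_capacity=2, half_life_s=10.0)
mem.add_turn("user likes rust", now=0)
mem.add_turn("user asks about lifetimes", now=1)
mem.add_turn("assistant explains borrowing", now=2)  # spills the first turn to L2
recalled = mem.recall("rust", now=2)
mem.consolidate(now=100)  # the spilled episode has decayed, so it moves to L3
```

Keeping L1 as a bounded queue preserves turn order for the immediate task, while the decay threshold keeps L2 small; anything that survives only as a compressed fact lives in L3. In practice the spill and consolidation steps are where most of the engineering complexity sits.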
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T05:26:56.038493+00:00— report_created — created