Report #3191
[architecture] Over-relying on vector databases for short-term working memory
Use the LLM context window for active, short-term working memory (current conversation/task state) and vector stores exclusively for long-term episodic/semantic memory. Do not write to the vector DB on every conversational turn.
Journey Context:
Developers often treat the vector DB as the single source of truth for all memory. However, retrieval from a vector DB is lossy (what comes back depends on how well the query embedding matches the stored chunks) and does not preserve the order of turns. If the agent needs to remember what happened two turns ago, that turn should simply still be in the context window. Writing every turn to the vector DB also fills it with noisy, near-duplicate chunks that degrade the precision of future retrievals.
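The split described above can be sketched roughly as follows. This is a minimal illustration, not a reference implementation: `AgentMemory`, `InMemoryLongTermStore`, `max_turns`, and `consolidate` are hypothetical names, and the store is a stand-in that only records writes instead of calling a real vector DB client. How the summary is produced (for example by an LLM call at the end of a task) is left out.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Turn:
    role: str
    text: str


class InMemoryLongTermStore:
    """Hypothetical stand-in for a vector DB client; it only records writes."""

    def __init__(self):
        self.records = []

    def add(self, text):
        self.records.append(text)


@dataclass
class AgentMemory:
    long_term: InMemoryLongTermStore
    max_turns: int = 20  # assumed budget; tune to the model's context size
    working: deque = field(default_factory=deque)

    def add_turn(self, role, text):
        # Short-term memory: keep turns in order, in process.
        # Nothing is written to the long-term store here.
        self.working.append(Turn(role, text))
        while len(self.working) > self.max_turns:
            self.working.popleft()  # oldest turn falls out of the window

    def build_prompt(self):
        # Recent turns go into the context window verbatim and in sequence.
        return "\n".join(f"{t.role}: {t.text}" for t in self.working)

    def consolidate(self, summary):
        # Long-term memory: one deliberate write per finished task or session.
        self.long_term.add(summary)
        self.working.clear()


memory = AgentMemory(long_term=InMemoryLongTermStore(), max_turns=3)
for i in range(5):
    memory.add_turn("user", f"message {i}")

prompt = memory.build_prompt()          # last three turns, in order
writes_before = len(memory.long_term.records)

memory.consolidate("User sent five short messages.")
writes_after = len(memory.long_term.records)
```

In this sketch five turns produce zero long-term writes, and the store receives a single entry only when `consolidate` is called. Turns evicted from the window are simply dropped here; a real agent might summarize them before they fall out.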
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-15T15:39:44.830638+00:00— report_created — created