Report #17150
[architecture] Over-relying on vector retrieval for data needed in every agent turn
Keep high-salience, low-volatility data (such as the user profile and core instructions) directly in the persistent context window (e.g., the system prompt or core memory blocks), and reserve the vector store (archival memory) for high-volume, episodic, or reference data.
Journey Context:
Developers often treat the vector database as a dumping ground for all state and retrieve everything via RAG. But RAG is lossy: top-K retrieval can miss the most critical instruction when its embedding does not sit close to the current query's phrasing. Information required for every response (such as the user's name or strict safety constraints) must therefore live in the active context window. The tradeoff is that in-context data consumes limited context window space and adds token cost on every turn. The right call is a tiered memory architecture: Core Memory (in-context, always visible) vs. Archival Memory (vector DB, retrieved on demand).
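The tiered split described above can be sketched roughly as follows. This is a minimal illustration, not any particular framework's API: `TieredMemory`, `retrieve`, and `build_prompt` are hypothetical names, and the bag-of-words cosine score is only a toy stand-in for a real embedding model and vector store.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass, field


def _bag_of_words(text: str) -> Counter:
    # Toy stand-in for an embedding model: lowercase word counts.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def _cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse word-count vectors.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


@dataclass
class TieredMemory:
    # Core memory: small, always rendered into the prompt.
    core: dict = field(default_factory=dict)
    # Archival memory: large, only the top-k matches are rendered.
    archival: list = field(default_factory=list)

    def retrieve(self, query: str, k: int = 2) -> list:
        # Lossy by design: entries with no overlap with the query are dropped.
        q = _bag_of_words(query)
        scored = [(_cosine(q, _bag_of_words(doc)), doc) for doc in self.archival]
        scored = [pair for pair in scored if pair[0] > 0]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [doc for _, doc in scored[:k]]

    def build_prompt(self, query: str, k: int = 2) -> str:
        # Core block is included unconditionally, regardless of the query.
        core_block = "\n".join(f"{name}: {value}" for name, value in self.core.items())
        hits = self.retrieve(query, k)
        archival_block = "\n".join(f"- {hit}" for hit in hits) or "(none retrieved)"
        return (
            f"[CORE MEMORY]\n{core_block}\n\n"
            f"[ARCHIVAL RESULTS]\n{archival_block}\n\n"
            f"[USER]\n{query}"
        )


# Hypothetical example data, for illustration only.
memory = TieredMemory(
    core={"user_name": "Ada", "safety": "Never give medical dosage advice."},
    archival=[
        "Trip notes: Ada visited Lisbon and liked the tram routes.",
        "Reference: the billing API paginates with a cursor parameter.",
    ],
)
prompt = memory.build_prompt("How does billing pagination work?")
print(prompt)
```

In this sketch the safety constraint reaches the prompt even though the query never mentions it, while the unrelated trip note is left out of the archival results. If the constraint lived only in `archival`, it would appear only when the query happened to share wording with it.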
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T04:41:39.251165+00:00— report_created — created