Report #49348
[architecture] Over-engineering vector databases for short, single-session tasks
Keep recent, highly relevant context directly in the LLM prompt window. Only offload to a vector store when context exceeds ~60-70% of the window size or when cross-session persistence is explicitly required.
Journey Context:
Developers often jump straight to RAG/vector DBs for agent memory. But an LLM can attend directly to everything in its context window, whereas retrieval is lossy (relevant chunks can be missed or truncated) and adds latency. If the task fits in the context window, just use the context. Use vector stores strictly for overflow (rolling context) and persistence (across sessions).
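The rule above can be sketched as a small memory wrapper. This is a minimal illustration, not a reference implementation: the names `SessionMemory`, `estimate_tokens`, `CONTEXT_WINDOW_TOKENS`, and `OFFLOAD_THRESHOLD` are hypothetical, the 4-characters-per-token estimate is a rough assumption, and a plain list stands in for a real vector store.

```python
from dataclasses import dataclass, field

# Assumed values, not from the report: tune them for your model.
CONTEXT_WINDOW_TOKENS = 8_000
OFFLOAD_THRESHOLD = 0.65  # midpoint of the ~60-70% guideline


def estimate_tokens(text: str) -> int:
    """Crude heuristic (about 4 characters per token); swap in a real tokenizer."""
    return max(1, len(text) // 4)


@dataclass
class SessionMemory:
    """Keeps context in the prompt; offloads only on overflow or for persistence."""

    needs_persistence: bool = False
    window_tokens: int = CONTEXT_WINDOW_TOKENS
    threshold: float = OFFLOAD_THRESHOLD
    in_prompt: list[str] = field(default_factory=list)
    offloaded: list[str] = field(default_factory=list)  # stand-in for a vector store

    def used_tokens(self) -> int:
        return sum(estimate_tokens(m) for m in self.in_prompt)

    def add(self, message: str) -> None:
        self.in_prompt.append(message)
        if self.needs_persistence:
            # Cross-session persistence: write through to the store immediately.
            self.offloaded.append(message)

        budget = int(self.window_tokens * self.threshold)
        # Rolling context: evict the oldest messages until back under budget,
        # always keeping at least the newest message in the prompt.
        while self.used_tokens() > budget and len(self.in_prompt) > 1:
            oldest = self.in_prompt.pop(0)
            if not self.needs_persistence:
                # Not yet stored anywhere, so offload it on eviction.
                self.offloaded.append(oldest)
```

For a short, single-session task the budget is never exceeded, so `offloaded` stays empty and no retrieval step is involved. The store is touched only when the prompt crosses the threshold or when `needs_persistence=True` is set explicitly.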
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T13:19:06.978211+00:00— report_created — created