Report #36838
[counterintuitive] Is vector similarity search enough for RAG retrieval?
Implement hybrid search (combining dense vector embeddings with sparse lexical search like BM25) for robust RAG pipelines.
Journey Context:
Vector embeddings excel at semantic similarity but often fail at exact keyword matching (e.g., specific names, IDs, acronyms, or typos). A query for 'HNSW' might retrieve documents about 'approximate nearest neighbor' yet miss the exact paper introducing HNSW. BM25 handles exact matches; vectors handle semantics. Relying solely on vectors creates silent retrieval failures: the results still look plausible, so nothing signals that the exact-match document was missed.
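The 'HNSW' case above can be illustrated with a minimal sketch. Everything in it is an assumption for demonstration purposes: the four documents, the hand-written 2-D "embeddings" (a real pipeline would use an embedding model), the helper names, and the choice of reciprocal rank fusion (RRF, with the commonly used constant k=60) to merge the two rankings. The report itself does not say how the dense and BM25 results should be combined, so RRF is only one plausible option.

```python
import math
from collections import Counter

# Toy corpus; doc 2 is the only one containing the literal token "hnsw".
docs = [
    "approximate nearest neighbor search survey",
    "graph indexes for vector similarity search",
    "hnsw hierarchical navigable small world graphs",
    "sourdough bread baking basics",
]
# Hypothetical hand-made embeddings standing in for a real embedding model.
doc_vecs = [[0.95, 0.05], [0.9, 0.2], [0.6, 0.4], [0.0, 1.0]]
query_text = "HNSW"
query_vec = [1.0, 0.0]  # assumed: the query embeds close to generic "ANN" docs


def tokenize(text):
    return text.lower().split()


def bm25_scores(query, corpus, k1=1.5, b=0.75):
    """Score every document against the query with a basic BM25 formula."""
    tokenized = [tokenize(d) for d in corpus]
    n = len(tokenized)
    avgdl = sum(len(t) for t in tokenized) / n
    df = Counter(term for toks in tokenized for term in set(toks))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log((n - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0


def rrf(rankings, k=60):
    """Reciprocal rank fusion: sum 1 / (k + rank) over each ranking."""
    fused = Counter()
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            fused[doc_id] += 1.0 / (k + rank)
    return [doc_id for doc_id, _ in fused.most_common()]


# Dense ranking: all docs ordered by cosine similarity to the query vector.
dense = [cosine(query_vec, v) for v in doc_vecs]
dense_rank = sorted(range(len(docs)), key=lambda i: dense[i], reverse=True)

# Lexical ranking: only docs with a positive BM25 score (a design choice here).
lexical = bm25_scores(query_text, docs)
lexical_rank = sorted(
    (i for i, s in enumerate(lexical) if s > 0),
    key=lambda i: lexical[i],
    reverse=True,
)

fused = rrf([dense_rank, lexical_rank])
print("dense only:", dense_rank)    # doc 2 sits in third place
print("hybrid    :", fused)         # doc 2 moves to first place
```

With these toy inputs, the dense ranking alone places the HNSW document third, so a top-1 or top-2 cutoff would drop it without any error. The fused ranking puts it first because it is the only document the lexical side matches. Real systems would normally delegate both halves to a search engine or vector database instead of scoring in application code.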
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T16:18:33.438034+00:00— report_created — created