Report #38507
[counterintuitive] Is cosine similarity vector search enough for RAG retrieval?
Implement hybrid search (combining vector search with keyword/BM25 search) to handle both semantic similarity and exact term matching.
Journey Context:
Developers often assume dense embeddings solve every search problem. However, vector search struggles with exact keyword matches (such as product IDs, specific names, or acronyms) and with negation. If a user searches for 'HNSW', a vector search might return results about 'approximate nearest neighbor' but miss documents that contain the exact acronym. BM25, by contrast, excels at exact lexical matching. Hybrid search combines the semantic understanding of vectors with the lexical precision of BM25, which can noticeably improve recall in production RAG systems.
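A minimal sketch of the idea, under stated assumptions: the report does not say how the two result lists are merged, so this example uses reciprocal rank fusion (RRF), one common choice. The function names (`bm25_rank`, `rrf_fuse`), the toy corpus, and the hard-coded `vector_ranking` (standing in for the output of a real embedding search) are all hypothetical and only for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive whitespace tokenizer; a real system would use a proper analyzer.
    return text.lower().split()


def bm25_rank(query, docs, k1=1.5, b=0.75):
    """Return indices of docs that match the query, best BM25 score first."""
    tokenized = [tokenize(d) for d in docs]
    n = len(docs)
    avgdl = sum(len(t) for t in tokenized) / n
    # Document frequency: in how many docs each term appears.
    df = Counter(term for toks in tokenized for term in set(toks))
    scores = {}
    for i, toks in enumerate(tokenized):
        tf = Counter(toks)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        if score > 0:  # keep only docs with at least one lexical match
            scores[i] = score
    return sorted(scores, key=lambda i: scores[i], reverse=True)


def rrf_fuse(rankings, k=60):
    """Reciprocal rank fusion: each list contributes 1 / (k + rank) per doc."""
    fused = Counter()
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            fused[doc_id] += 1.0 / (k + rank)
    return sorted(fused, key=lambda d: fused[d], reverse=True)


docs = [
    "approximate nearest neighbor search overview",
    "vector databases and similarity search",
    "HNSW index tuning guide",
    "cooking pasta at home",
]

# Assumption: pretend an embedding search for "HNSW" returned this order,
# ranking the semantically related docs above the exact-acronym doc.
vector_ranking = [0, 1, 2, 3]
keyword_ranking = bm25_rank("HNSW", docs)

fused = rrf_fuse([vector_ranking, keyword_ranking])
print(fused)  # [2, 0, 1, 3]
```

In this toy run the document containing the literal acronym moves from third place in the vector-only ranking to first place after fusion, because it is the only document the keyword side returns. The constant `k=60` is a conventional RRF default, not a value taken from the report; weighted score blending is another option and would need its own tuning.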
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T19:06:49.160376+00:00— report_created — created