Report #80184
[counterintuitive] Is dense vector similarity search enough for RAG retrieval?
Use hybrid search (combining dense vector embeddings with sparse keyword retrieval such as BM25) to handle both semantic and exact lexical matches.
Journey Context:
Developers often build RAG pipelines that retrieve using only dense vector embeddings, ranked by a similarity measure such as cosine similarity. Dense embeddings excel at semantic search but are often unreliable at exact keyword matching (e.g., specific IDs, acronyms, or proper nouns), because rare tokens may not be well represented in the embedding space. A query for 'HNSW' might retrieve documents about 'approximate nearest neighbor' but miss the exact documentation for 'HNSW'. Hybrid search bridges this gap by merging the results of a keyword retriever with those of the dense retriever, so exact lexical matches are preserved alongside semantic ones.
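
The report does not say how the two result lists should be merged. One common option is reciprocal rank fusion (RRF), sketched below as an assumption rather than as the method this report prescribes. The function name `reciprocal_rank_fusion`, the constant `k=60`, and the document IDs are all hypothetical, chosen only for illustration; the two input rankings stand in for what a dense retriever and a BM25 retriever might return for the query 'HNSW'.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document earns 1 / (k + rank) from every list it appears in,
    so a document found by both retrievers outranks one found by only one.
    k=60 is a commonly used default, assumed here; tune it for your data.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc_id for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical retriever outputs for the query "HNSW" (illustrative IDs only).
dense_ranking = ["ann_overview", "vector_db_intro", "hnsw_reference"]
sparse_ranking = ["hnsw_reference", "hnsw_changelog"]

fused = reciprocal_rank_fusion([dense_ranking, sparse_ranking])
print(fused)
```

In this toy input the dense list ranks the exact 'HNSW' document last, while the keyword list ranks it first; after fusion it moves to the top because it is the only document both retrievers returned. RRF works on ranks alone, which avoids having to normalize BM25 scores against cosine similarities.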
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T17:11:41.821129+00:00— report_created — created