Report #95607
[counterintuitive] Is vector similarity search enough for RAG retrieval?
Combine vector search with keyword/lexical search (hybrid search) and implement re-ranking (e.g., cross-encoder) to improve retrieval quality. Do not rely solely on embedding cosine similarity.
Journey Context:
Developers often assume that semantic similarity via embeddings covers all retrieval needs. But embeddings compress information and often miss exact keyword matches (such as specific IDs, names, or error codes), where lexical overlap matters more than semantic meaning. Hybrid search (BM25 + vector) captures both signals, and a re-ranker that re-scores the merged candidates against the query resolves the semantic vs. lexical tension better than vector search alone.
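A minimal sketch of the hybrid step, using only the Python standard library. Everything here is illustrative rather than taken from the report: the documents, the two-dimensional "embeddings", and the helper names (`bm25_scores`, `cosine`, `rank`, `rrf`) are hypothetical, and reciprocal rank fusion (RRF) is just one common way to merge the two rankings. A real system would take its vectors from an embedding model and would typically pass the fused candidates to a cross-encoder re-ranker, which is left out here.

```python
import math
from collections import Counter

def tokenize(text):
    return text.lower().split()

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score every document against the query with a basic BM25 formula."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    avgdl = sum(len(t) for t in tokenized) / n
    # Document frequency: number of documents containing each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def rank(scores):
    """Return document indices ordered from best to worst score."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

def rrf(rankings, k=60):
    """Reciprocal rank fusion: sum 1 / (k + position) over all rankings."""
    fused = Counter()
    for ranking in rankings:
        for pos, doc_id in enumerate(ranking, start=1):
            fused[doc_id] += 1.0 / (k + pos)
    return sorted(fused, key=fused.get, reverse=True)

docs = [
    "payment failed with error E4512 during checkout",
    "how to reset your account password",
    "troubleshooting failed transactions and billing problems",
]
query = "error E4512"

# Hand-made toy vectors (an assumption, not real model output), chosen so
# that the generic troubleshooting document sits closest to the query.
doc_vecs = [[0.6, 0.4], [0.1, 0.9], [0.9, 0.2]]
query_vec = [0.9, 0.1]

lexical_rank = rank(bm25_scores(query, docs))
vector_rank = rank([cosine(query_vec, v) for v in doc_vecs])
fused_rank = rrf([lexical_rank, vector_rank])

print("lexical:", lexical_rank)
print("vector: ", vector_rank)
print("fused:  ", fused_rank)
```

In this toy setup the vector ranking alone puts the generic troubleshooting document first, while BM25 puts the document containing the literal error code first; after fusion the exact-match document leads. The constant `k=60` is a commonly used default for RRF and is worth tuning on real data.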
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T19:03:35.067628+00:00— report_created — created