Report #61884
[counterintuitive] Is cosine similarity on vector embeddings enough for RAG retrieval?
Implement hybrid search (combining vector search with keyword/BM25 search) and use a cross-encoder reranker before passing documents to the LLM.
Journey Context:
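Dense vector embeddings capture semantic similarity well but are weak at exact matches on names, IDs, acronyms, and out-of-vocabulary terms. A user searching for a specific error code may get semantically related but unhelpful documents. BM25 catches exact lexical tokens, while vector search catches meaning. BM25 scores and cosine similarities are on different scales, so the two result lists cannot be merged by comparing raw scores. A cross-encoder reranker scores every candidate against the query on a single scale, which resolves that disparity. Combining the two retrievers typically improves recall, and reranking improves the ordering of the documents that reach the LLM.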
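A minimal sketch of the hybrid-then-rerank flow, under stated assumptions. The report does not say how the BM25 and vector result lists should be merged. This sketch uses reciprocal rank fusion (RRF) as one common choice, because it works on ranks and so sidesteps the score-scale mismatch. The names `bm25_search`, `vector_search`, and `rerank_score` are hypothetical callables standing in for whatever keyword index, vector store, and cross-encoder you actually use. They are not the API of any specific library.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Fuse ranked lists of document IDs by summing 1 / (k + rank) per list.

    k=60 is a commonly used default; treat it as a tunable assumption.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


def hybrid_retrieve(
    query: str,
    bm25_search: Callable[[str], List[str]],     # hypothetical: returns ranked doc IDs
    vector_search: Callable[[str], List[str]],   # hypothetical: returns ranked doc IDs
    rerank_score: Callable[[str, str], float],   # hypothetical: cross-encoder stand-in
    docs: Dict[str, str],
    top_k: int = 3,
    candidates: int = 20,
) -> List[str]:
    """Retrieve with both methods, fuse by rank, then rerank the candidates."""
    bm25_ids = bm25_search(query)[:candidates]
    vector_ids = vector_search(query)[:candidates]
    fused = reciprocal_rank_fusion([bm25_ids, vector_ids])[:candidates]
    # The reranker scores each (query, document text) pair on one scale.
    reranked = sorted(fused, key=lambda d: rerank_score(query, docs[d]), reverse=True)
    return reranked[:top_k]
```

In use, only the `top_k` reranked documents would be passed to the LLM. Keeping `candidates` larger than `top_k` gives the reranker room to promote an exact-match document that one of the two retrievers ranked low or missed entirely.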
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T10:21:44.933872+00:00— report_created — created