Report #44971
[counterintuitive] Is cosine similarity of embeddings sufficient for RAG retrieval?
Combine dense vector retrieval with sparse retrieval (BM25) and cross-encoder reranking, rather than relying solely on embedding cosine similarity.
Journey Context:
Developers often assume that embedding distance fully captures semantic relevance for retrieval. However, dense embeddings often miss exact keyword matches (such as IDs, specific names, or rare tokens) and suffer from the 'hubness' problem, in which certain vectors are disproportionately close to many queries. Hybrid search (BM25 + vectors) followed by reranking often outperforms pure vector search in production RAG pipelines.
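A minimal sketch of the hybrid idea described above, using only the Python standard library. Everything here is illustrative and not taken from the report: the function names (`tokenize`, `bm25_scores`, `cosine`, `ranking`, `rrf`), the three toy documents, and the hand-written 3-dimensional "embeddings" are assumptions standing in for a real index and a real embedding model. Reciprocal rank fusion (RRF) is used as one common way to merge sparse and dense rankings; the report does not name a specific fusion method.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive whitespace tokenizer; a real pipeline would use a proper analyzer.
    return text.lower().split()


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document against the query with Okapi BM25."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    avgdl = sum(len(t) for t in tokenized) / n
    # Document frequency: in how many documents each term appears.
    df = Counter(term for toks in tokenized for term in set(toks))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def ranking(scores):
    """Document indices sorted by descending score."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


def rrf(rankings, k=60):
    """Reciprocal rank fusion: sum 1 / (k + rank) across the input rankings."""
    fused = Counter()
    for r in rankings:
        for rank, doc_id in enumerate(r, start=1):
            fused[doc_id] += 1.0 / (k + rank)
    return sorted(fused, key=lambda d: fused[d], reverse=True)


# Toy corpus (hypothetical). Document 1 contains the exact ID in the query.
docs = [
    "reset your account password from the settings page",
    "error code E4417 means the license server is unreachable",
    "troubleshooting login and authentication problems",
]
query = "what does E4417 mean"

# Hand-written vectors standing in for model embeddings (assumption).
# They are chosen so that the dense side prefers document 2.
doc_vecs = [[0.1, 0.9, 0.2], [0.2, 0.1, 0.9], [0.9, 0.2, 0.3]]
query_vec = [0.9, 0.1, 0.4]

sparse_rank = ranking(bm25_scores(query, docs))
dense_rank = ranking([cosine(query_vec, v) for v in doc_vecs])
fused = rrf([sparse_rank, dense_rank])

print("sparse:", sparse_rank)
print("dense: ", dense_rank)
print("fused: ", fused)
```

In this toy setup the dense ranking alone puts document 2 first, while BM25 surfaces document 1 through the exact `E4417` match, and the fused list puts document 1 on top. A cross-encoder reranking stage is not shown; it would typically rescore only the top few fused candidates against the query, since scoring every query-document pair jointly is far more expensive than comparing precomputed vectors.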
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T05:57:14.703039+00:00— report_created — created