Report #44868
[counterintuitive] Is cosine similarity on embeddings sufficient for RAG retrieval?
Not on its own. Combine dense vector search with sparse retrieval (BM25 or keyword search), or use learned sparse embeddings, and add a re-ranking step rather than relying purely on embedding cosine similarity.
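A minimal sketch of one way to merge dense and sparse results, assuming reciprocal rank fusion (RRF) as the combining step. The report does not name a fusion method, so RRF is an illustrative choice; the function name, the chunk ids, and the constant k=60 are assumptions made for this example.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of doc ids into one list, best first.

    Each doc earns 1 / (k + rank) from every list it appears in, so a doc
    ranked well by both retrievers beats one ranked well by only one.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by fused score; break ties by doc id so the output is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical retriever outputs (best first); not real data.
dense_ranking = ["chunk_a", "chunk_b", "chunk_c"]   # e.g. cosine similarity
sparse_ranking = ["chunk_c", "chunk_a", "chunk_d"]  # e.g. BM25

fused = reciprocal_rank_fusion([dense_ranking, sparse_ranking])
print(fused)
```

RRF works on ranks only, so it avoids having to put cosine scores and BM25 scores on a common scale. In a full pipeline the top fused candidates would then go to a re-ranker (for example a cross-encoder) for rescoring; that step is not shown here.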
Journey Context:
Developers often assume embedding models capture semantic meaning perfectly, so the chunk with the highest cosine similarity must be the best one. But an embedding compresses a passage into a single vector, which loses nuance. Dense embeddings struggle with exact matches (names, IDs, specific numbers) and with out-of-domain terminology. A high similarity score may only mean the chunk is on a related topic, not that it answers the specific question.
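To illustrate the exact-match point, here is a small pure-Python sketch of BM25 scoring, a common sparse retrieval formula. The documents, the invoice id, the whitespace tokenizer, and the parameter values k1=1.5 and b=0.75 are all made up for illustration; a real system would more likely use a search engine or an established BM25 library.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document against the query with a basic BM25 formula."""
    tokenized = [d.lower().split() for d in docs]  # naive whitespace tokenizer
    n = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n
    # Document frequency: in how many docs does each term appear?
    df = Counter(term for tokens in tokenized for term in set(tokens))

    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            length_norm = k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / (tf[term] + length_norm)
        scores.append(score)
    return scores


# Hypothetical chunks: only the first one contains the exact id.
docs = [
    "invoice inv-20931 was refunded after the customer reported a duplicate charge",
    "our refund policy covers duplicate charges reported within thirty days",
    "invoices are generated on the first day of each billing cycle",
]
scores = bm25_scores("status of invoice inv-20931", docs)
best = max(range(len(docs)), key=scores.__getitem__)
print(best, scores)
```

Because BM25 matches literal tokens, the chunk containing the exact id ranks first, while chunks that are only topically related receive little or no credit. That literal-match signal is the one a dense retriever can miss.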
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T05:46:40.312045+00:00— report_created — created