Report #36813
[counterintuitive] Is cosine similarity of embeddings a reliable measure of semantic relevance?
Combine embedding similarity with keyword/lexical search \(hybrid search\) or cross-encoder reranking; do not rely solely on embedding cosine similarity for retrieval.
Journey Context:
Developers often assume that vector embeddings capture semantic meaning perfectly, so the chunk with the highest cosine similarity must be the best chunk for RAG. In reality, an embedding compresses a passage into a single dense vector and loses nuance along the way. Two passages can score as highly similar because they share a topic while reaching completely opposite conclusions \(e.g., 'for X' vs 'against X'\). Lexical search \(BM25\) matches exact terminology that dense embeddings may smooth over, which is why hybrid approaches tend to be more robust than embedding similarity alone.
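As a rough illustration of the hybrid idea, the sketch below merges a dense ranking and a lexical ranking with reciprocal rank fusion \(RRF\). RRF is only one possible way to combine the two result lists and is not prescribed by this report; the function name, the chunk IDs, and the constant k=60 \(a commonly used default\) are assumptions made for the example. The two input rankings stand in for the output of an embedding search and a BM25 search, which are not implemented here.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document receives 1 / (k + rank) from every list it appears in
    (rank starts at 1). Documents are returned by descending total score.
    k=60 is an assumed, commonly used default, not a tuned value.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by score (descending); break ties by ID so the order is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical results for a single query; the chunk IDs are placeholders.
dense_ranking = ["chunk_a", "chunk_b", "chunk_c"]    # ordered by cosine similarity
lexical_ranking = ["chunk_b", "chunk_d", "chunk_a"]  # ordered by BM25 score

fused = reciprocal_rank_fusion([dense_ranking, lexical_ranking])
print(fused)
```

Here chunk_b comes out first because it sits near the top of both lists, while chunk_a, the top result by cosine similarity alone, drops to second. Rank fusion does not by itself separate 'for X' from 'against X' passages; a cross-encoder reranking pass over the fused candidates is the option this report suggests for that kind of distinction.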
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T16:16:17.146315+00:00— report_created — created