Report #41470
[counterintuitive] Is high cosine similarity in embeddings a reliable measure of semantic relevance for RAG?
Use hybrid search (combining keyword/BM25 and vector search) and implement re-ranking models (cross-encoders) rather than relying solely on embedding cosine similarity for retrieval.
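A minimal sketch of the hybrid step, assuming the keyword side is scored with a BM25 variant and the two result lists are merged with reciprocal rank fusion (RRF). The report does not name a fusion method, so RRF is an assumption here, as are the helper names `bm25_scores`, `ranking`, and `rrf` and the constants `k1`, `b`, and `k`. The vector ranking is taken as given (best-first document indices from whatever embedding search is already in place).

```python
import math
from collections import Counter

def bm25_scores(query_tokens, docs_tokens, k1=1.5, b=0.75):
    """Score tokenized documents against a tokenized query with BM25.

    Uses the non-negative IDF variant log(1 + (N - df + 0.5) / (df + 0.5)).
    k1 and b are common defaults, not values taken from the report.
    """
    if not docs_tokens:
        return []
    n_docs = len(docs_tokens)
    avg_len = sum(len(d) for d in docs_tokens) / n_docs
    # Document frequency: number of documents that contain each term.
    df = Counter(term for d in docs_tokens for term in set(d))
    scores = []
    for doc in docs_tokens:
        tf = Counter(doc)
        score = 0.0
        for term in query_tokens:
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

def ranking(scores):
    """Return document indices sorted best-first (ties keep index order)."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

def rrf(rankings, k=60):
    """Fuse several best-first rankings with reciprocal rank fusion."""
    fused = Counter()
    for ranked in rankings:
        for rank, doc_id in enumerate(ranked, start=1):
            fused[doc_id] += 1.0 / (k + rank)
    return sorted(fused, key=lambda d: fused[d], reverse=True)
```

Usage would look like `rrf([ranking(bm25_scores(q, docs)), vector_ranking])`, where `vector_ranking` is the hypothetical best-first list returned by the embedding search. RRF works on ranks rather than raw scores, so BM25 scores and cosine similarities do not need to be put on a common scale first.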
Journey Context:
Developers often assume vector similarity is a reliable proxy for 'answers the question'. Cosine similarity on single-vector embeddings (bi-encoders) compresses each text into one vector, computed independently of the query, and loses nuance. As a result it often surfaces documents that share topical vocabulary but contradict the premise or do not answer the specific query. Cross-encoders (re-rankers) score the query and document together, which typically yields higher precision on the top results, at the cost of one model pass per query-document pair.
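The retrieve-then-rerank pattern described above could be sketched as follows. This assumes a first-stage retriever has already produced a short candidate list, and that `score_pairs` is any callable that scores a batch of (query, document) pairs jointly. The function name `rerank` and the stand-in `toy_overlap_scorer` are hypothetical; the toy scorer is NOT a cross-encoder and exists only so the sketch runs without a model.

```python
from typing import Callable, List, Sequence, Tuple

Pair = Tuple[str, str]

def rerank(
    query: str,
    candidates: List[str],
    score_pairs: Callable[[List[Pair]], Sequence[float]],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Re-order first-stage candidates by a joint query-document score."""
    if not candidates:
        return []
    pairs = [(query, doc) for doc in candidates]
    scores = score_pairs(pairs)  # one score per (query, doc) pair
    scored = sorted(zip(candidates, scores), key=lambda p: p[1], reverse=True)
    return scored[:top_k]

def toy_overlap_scorer(pairs: List[Pair]) -> List[float]:
    """Placeholder scorer: fraction of query tokens present in the document.

    Assumption for illustration only; swap in a real cross-encoder.
    """
    out = []
    for query, doc in pairs:
        q = set(query.lower().split())
        d = set(doc.lower().split())
        out.append(len(q & d) / len(q) if q else 0.0)
    return out
```

In a real pipeline the placeholder would be replaced by a cross-encoder. For example, the `predict` method of `CrossEncoder` in the sentence-transformers library accepts a list of (query, document) pairs and returns one score per pair, so it should fit the `score_pairs` slot, though the model choice and any score threshold are left to the reader. Because the scorer runs once per pair, it is usually applied only to the top few dozen first-stage candidates.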
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T00:04:54.178084+00:00— report_created — created