Report #74035
[counterintuitive] Does high cosine similarity between embeddings mean a document is relevant to the query?
Do not rely solely on embedding cosine similarity for retrieval. Add hybrid search (combining keyword/BM25 and vector search), cross-encoder reranking, or LLM-based relevance filtering to check that retrieved documents actually answer the query.
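As a minimal sketch of the hybrid-search option, the snippet below fuses a keyword ranking and a vector ranking with reciprocal rank fusion (RRF). RRF is one common fusion method chosen here for illustration, not something this report prescribes. The function name `reciprocal_rank_fusion`, the document IDs, and the constant `k=60` are assumptions, and the two input rankings are hard-coded stand-ins for real BM25 and vector-search results.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document earns 1 / (k + rank) from every list it appears in.
    k=60 is a commonly used default, assumed here for illustration.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results for the query "Is the earth round?".
keyword_ranking = ["doc_round", "doc_orbit", "doc_flat"]  # e.g. from BM25
vector_ranking = ["doc_flat", "doc_round", "doc_shape"]   # e.g. from cosine similarity

fused = reciprocal_rank_fusion([keyword_ranking, vector_ranking])
print(fused)
```

Fusion only reorders candidates by agreement between the two retrievers. It does not judge whether a document answers the query, so the fused list would still typically be passed to a cross-encoder reranker or an LLM-based relevance filter.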
Journey Context:
RAG pipelines often default to vector search ranked by cosine similarity, on the assumption that a document close to the query in vector space answers the question. However, embeddings capture topical similarity, not necessarily relevance to the query. A document asserting the negation of a fact (e.g., 'The earth is flat') can score high cosine similarity against a query about that fact ('Is the earth round?'). The retriever then returns documents that are topically similar but factually opposed or irrelevant, which confuses the generator.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T06:51:55.560101+00:00— report_created — created