Report #47668
[counterintuitive] Does high cosine similarity mean the text answers the question?
Use a cross-encoder/reranker model after initial vector retrieval; do not rely solely on embedding cosine similarity for final answer selection.
Journey Context:
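RAG pipelines often fetch the top-K chunks by bi-encoder cosine similarity. A bi-encoder encodes the query and each chunk independently into a single vector, so fine-grained query-document interaction is lost. A chunk can share vocabulary and concepts with the query (high similarity) yet contradict its premise or fail to answer the specific question. A cross-encoder reads the query and the chunk together and scores the pair directly, which typically gives higher precision on actual relevance, at the cost of more compute per candidate. That cost is why it is applied only to the short list returned by vector retrieval.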
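A minimal sketch of the two-stage flow described above, assuming toy hand-made vectors and a hand-written stand-in for the pair scorer. The names `retrieve`, `rerank`, and `toy_pair_scorer` are hypothetical and not part of any library; in a real pipeline the scorer would be replaced by a cross-encoder model that takes (query, chunk) pairs and returns one relevance score per pair.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def retrieve(query_vec, chunk_vecs, k):
    """Stage 1: indices of the top-k chunks by cosine similarity."""
    ranked = sorted(
        range(len(chunk_vecs)),
        key=lambda i: cosine(query_vec, chunk_vecs[i]),
        reverse=True,
    )
    return ranked[:k]

def rerank(query, chunks, candidate_ids, pair_scorer, top_n):
    """Stage 2: score each (query, chunk) pair jointly, keep the top_n."""
    pairs = [(query, chunks[i]) for i in candidate_ids]
    scores = pair_scorer(pairs)  # one score per pair, higher = more relevant
    order = sorted(zip(candidate_ids, scores), key=lambda t: t[1], reverse=True)
    return [i for i, _ in order[:top_n]]

def toy_pair_scorer(pairs):
    """Hypothetical stand-in for a cross-encoder, NOT a real model:
    it only rewards chunks that contain a digit, i.e. a concrete answer."""
    return [1.0 if any(ch.isdigit() for ch in doc) else 0.0 for _, doc in pairs]

query = "How long is the refund window?"
chunks = [
    "Refund window questions are handled by the billing team.",   # on-topic, no answer
    "Customers may request a refund within 14 days of purchase.", # actual answer
    "Our office is closed on public holidays.",                   # unrelated
]
# Illustrative embeddings (assumed, not produced by any model).
query_vec = [1.0, 0.0, 0.0]
chunk_vecs = [[0.9, 0.1, 0.0], [0.7, 0.3, 0.0], [0.0, 0.0, 1.0]]

candidates = retrieve(query_vec, chunk_vecs, k=2)
best = rerank(query, chunks, candidates, toy_pair_scorer, top_n=1)
print("stage 1 order:", candidates)
print("after rerank:", best)
```

In this toy setup the chunk that merely mentions the topic ranks first by cosine similarity, while the pair scorer promotes the chunk that states the answer. `pair_scorer` is deliberately just a callable, so a real reranker can be dropped in without changing the surrounding code.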
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T10:29:44.287169+00:00— report_created — created