Report #46350
[counterintuitive] High cosine similarity does not guarantee semantic relevance
Use cross-encoders/rerankers on top-k embedding results to capture query-document interactions, rather than relying solely on bi-encoder cosine similarity for final ranking.
Journey Context:
Vector databases and cosine similarity are the default for RAG retrieval. However, bi-encoder embeddings compress each document into a single vector computed independently of the query, which loses nuance and cannot capture query-document interactions. A document might have high cosine similarity due to shared vocabulary but be irrelevant to the specific query intent. Cross-encoders process the query and document together, which typically yields more precise relevance scores at the cost of speed. Because every query-document pair needs its own model pass, they are practical only as a second-stage reranker over a small top-k candidate set.
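The two-stage pattern described above can be sketched as follows. This is a minimal illustration, not a production implementation: `retrieve_then_rerank`, `toy_embed`, and `toy_pair_score` are hypothetical names introduced here, and the two toy scorers are deliberately crude stand-ins (bag-of-words counts for the bi-encoder, shared word pairs for the cross-encoder) so the example runs without any model download.

```python
import math
from collections import Counter
from typing import Callable, Mapping, Sequence


def cosine(a: Mapping[str, float], b: Mapping[str, float]) -> float:
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(v * b.get(t, 0.0) for t, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve_then_rerank(
    query: str,
    docs: Sequence[str],
    embed: Callable[[str], Mapping[str, float]],
    cross_score: Callable[[str, str], float],
    k: int = 10,
    top_n: int = 3,
) -> list[str]:
    """Stage 1: cosine top-k over independent embeddings. Stage 2: pairwise rerank."""
    q_vec = embed(query)
    # Stage 1 is cheap: document vectors do not depend on the query,
    # so a real system would precompute and index them.
    candidates = sorted(docs, key=lambda d: cosine(q_vec, embed(d)), reverse=True)[:k]
    # Stage 2 scores each (query, document) pair jointly, so it only
    # runs on the k candidates rather than the whole corpus.
    return sorted(candidates, key=lambda d: cross_score(query, d), reverse=True)[:top_n]


def toy_embed(text: str) -> Mapping[str, float]:
    """ASSUMPTION: stand-in for a bi-encoder; plain word counts, order is lost."""
    return Counter(text.lower().split())


def toy_pair_score(query: str, doc: str) -> float:
    """ASSUMPTION: stand-in for a cross-encoder; counts query word pairs found in doc."""
    q, d = query.lower().split(), doc.lower().split()
    doc_bigrams = set(zip(d, d[1:]))
    return float(sum(1 for bigram in zip(q, q[1:]) if bigram in doc_bigrams))


query = "apple pie recipe"
docs = [
    "weather forecast for tomorrow",
    "an easy apple pie recipe for beginners",
    "recipe pie apple",  # same vocabulary as the query, scrambled order
]
# The scrambled document has the highest cosine score in stage 1,
# but the pairwise scorer in stage 2 promotes the actual recipe.
print(retrieve_then_rerank(query, docs, toy_embed, toy_pair_score, k=2, top_n=1))
```

In a real pipeline, `embed` would wrap an embedding model and `cross_score` would wrap a reranker model. The value of `k` is the main tuning knob: a larger `k` gives the reranker more chances to recover a relevant document that stage 1 ranked poorly, at a proportionally higher latency cost.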
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T08:16:19.973719+00:00— report_created — created