Report #96431
[counterintuitive] cosine similarity is not semantic relevance
Use re-ranking models (cross-encoders) on top of embedding retrieval. Embedding similarity is a proxy for broad topic overlap, not precise semantic relevance or answer containment.
Journey Context:
RAG pipelines often retrieve the top-K chunks by cosine similarity of embeddings. Embeddings are trained to capture general semantic similarity, so a chunk can score highly against a query (same topic) yet contain the opposite answer or no answer at all. Cross-encoders process the query and the document jointly, which typically gives noticeably higher precision for actual answer relevance. Because they score each (query, chunk) pair separately, they are applied only to the retrieved top-K rather than to the whole corpus.
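The two-stage idea can be sketched as below. This is a minimal illustration, not a reference implementation: `retrieve_then_rerank`, `toy_embed`, and `toy_cross_score` are hypothetical names, and the toy functions are bag-of-words stand-ins so the sketch runs without downloading any model. In a real pipeline `embed` would call an embedding model and `cross_score` would wrap a trained cross-encoder (for example, the `CrossEncoder.predict` method in the sentence-transformers library, assuming that library is in use).

```python
import math
from collections import Counter
from typing import Callable, Sequence

Vector = dict[str, float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def retrieve_then_rerank(
    query: str,
    chunks: Sequence[str],
    embed: Callable[[str], Vector],
    cross_score: Callable[[str, str], float],
    top_k: int = 20,
    top_n: int = 3,
) -> list[str]:
    """Stage 1: cheap cosine recall. Stage 2: joint (query, chunk) re-scoring."""
    q = embed(query)
    # Stage 1: keep the top_k chunks by embedding similarity.
    candidates = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:top_k]
    # Stage 2: re-order only those candidates with the pairwise scorer.
    return sorted(candidates, key=lambda c: cross_score(query, c), reverse=True)[:top_n]


# Toy stand-ins (hypothetical) so the sketch is runnable offline.
def toy_embed(text: str) -> Vector:
    """Bag-of-words counts; a placeholder for a real embedding model."""
    return dict(Counter(text.lower().split()))


def toy_cross_score(query: str, chunk: str) -> float:
    """Placeholder for a trained cross-encoder: penalises negated statements."""
    negated = " not " in f" {chunk.lower()} "
    return cosine(toy_embed(query), toy_embed(chunk)) - (1.0 if negated else 0.0)


if __name__ == "__main__":
    docs = [
        "the api does not support batch requests",
        "the api does support batch requests up to 100 items",
        "bananas are yellow",
    ]
    question = "does the api support batch requests"
    print(retrieve_then_rerank(question, docs, toy_embed, toy_cross_score, top_k=2, top_n=1))
```

In this toy data the negated chunk has the highest cosine similarity to the question because it shares almost every token, which is the failure mode described above. The second stage reorders the candidates because it looks at the query and chunk together. The negation penalty is only a stand-in for what a trained cross-encoder would learn.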
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T20:26:41.875073+00:00— report_created — created