Report #62485
[counterintuitive] Is cosine similarity of embeddings a reliable metric for semantic relevance?
Use cosine similarity as a fast initial filter, but always pair it with a cross-encoder or an LLM-based reranker for final relevance scoring, especially in RAG pipelines.
Journey Context:
Developers often treat embedding distance as a perfect proxy for 'meaning'. Embeddings are trained to capture general distributional semantics, but they often miss task-specific nuance and lexical precision (e.g., negation, numbers), and their similarity scores can be dominated by frequency or length effects. Bi-encoder (embedding) similarity is therefore a rough heuristic: the query and the document are encoded independently and compared only through their vectors. A cross-encoder instead performs full attention over the query and document together, which typically yields much higher precision at a higher compute cost per pair.
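The two-stage pattern described above can be sketched roughly as follows. This is a minimal illustration, not a production pipeline: the vectors are hand-made stand-ins for real embeddings, and `toy_rerank_score` is a hypothetical placeholder for a cross-encoder or LLM-based reranker (it only checks word overlap and negation). The names `retrieve_then_rerank`, `docs`, and `top_k` are assumptions made for this example.

```python
import math
import re
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def toy_rerank_score(query: str, doc: str) -> float:
    """Hypothetical stand-in for a cross-encoder: it sees both texts at once.

    Scores word overlap and heavily penalizes a negation mismatch.
    """
    q_tokens = set(re.findall(r"[a-z0-9]+", query.lower()))
    d_tokens = set(re.findall(r"[a-z0-9]+", doc.lower()))
    score = float(len(q_tokens & d_tokens))
    if ("not" in q_tokens) != ("not" in d_tokens):
        score -= 10.0
    return score


def retrieve_then_rerank(
    query_text: str,
    query_vec: Vector,
    docs: List[Tuple[str, Vector]],
    rerank_score: Callable[[str, str], float],
    top_k: int = 2,
) -> List[str]:
    # Stage 1: cheap cosine filter over precomputed document vectors.
    candidates = sorted(
        docs, key=lambda d: cosine_similarity(query_vec, d[1]), reverse=True
    )[:top_k]
    # Stage 2: more expensive pairwise scoring on the short candidate list only.
    texts = [text for text, _ in candidates]
    return sorted(texts, key=lambda t: rerank_score(query_text, t), reverse=True)


# Hand-made vectors: the negated sentence sits closest to the query.
docs = [
    ("The drug is not safe for children.", [0.90, 0.10, 0.00]),
    ("The drug is safe for children.", [0.88, 0.12, 0.00]),
    ("Shipping takes five days.", [0.00, 0.10, 0.90]),
]
query_text = "Is the drug safe for children?"
query_vec = [0.90, 0.10, 0.00]

cosine_top = max(docs, key=lambda d: cosine_similarity(query_vec, d[1]))[0]
ranked = retrieve_then_rerank(query_text, query_vec, docs, toy_rerank_score)
print("cosine top hit:", cosine_top)
print("after rerank:  ", ranked[0])
```

In this toy setup the cosine stage alone puts the negated sentence first, while the second stage, which reads the query and candidate together, promotes the non-negated one. In a real RAG pipeline the second stage would call an actual cross-encoder or LLM reranker, and `top_k` would be chosen to balance recall against reranking cost.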
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T11:22:04.110987+00:00— report_created — created