Report #87000
[counterintuitive] high cosine similarity means semantic relevance
Combine embedding similarity with metadata filtering or a re-ranking model (cross-encoder), because cosine similarity on single-vector embeddings often reflects shared topics and entities rather than whether a document actually answers the query.
Journey Context:
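RAG pipelines often rely on `top_k` cosine similarity alone. But an embedding model compresses a whole passage into a single vector, and nuance is lost in the process. A document that mentions all the same entities as the query (but does not answer it) can still have high cosine similarity. Cross-encoders (re-rankers) read the query and document *together* and score the pair directly, so they judge whether the document addresses the query rather than just its proximity in vector space. They cost more per pair than a vector lookup, so they are typically applied only to the small candidate set returned by the first-stage retrieval, which keeps the extra cost modest.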
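The two-stage idea described above can be sketched roughly as follows. This is a minimal illustration using only the standard library, not a production pipeline: the function names (`cosine_top_k`, `retrieve_then_rerank`, `toy_pair_scorer`), the documents, and the hand-written vectors are all hypothetical. In particular, `toy_pair_scorer` is only a stand-in for a trained cross-encoder; in a real system that callable would presumably wrap a model such as the `CrossEncoder.predict` call in sentence-transformers, scoring `(query, document)` pairs.

```python
import math
from typing import Callable, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cosine_top_k(query_vec, doc_vecs, top_k: int) -> List[int]:
    """Stage 1: indices of the top_k documents by cosine similarity."""
    order = sorted(
        range(len(doc_vecs)),
        key=lambda i: cosine(query_vec, doc_vecs[i]),
        reverse=True,
    )
    return order[:top_k]


def retrieve_then_rerank(
    query: str,
    query_vec: Sequence[float],
    docs: Sequence[str],
    doc_vecs: Sequence[Sequence[float]],
    pair_scorer: Callable[[str, str], float],
    top_k: int = 2,
    top_n: int = 1,
) -> List[str]:
    """Stage 2: rescore the stage-1 candidates with a joint (query, doc) scorer."""
    candidates = cosine_top_k(query_vec, doc_vecs, top_k)
    reranked = sorted(
        candidates, key=lambda i: pair_scorer(query, docs[i]), reverse=True
    )
    return [docs[i] for i in reranked[:top_n]]


def toy_pair_scorer(query: str, doc: str) -> float:
    """Hypothetical stand-in for a cross-encoder: it sees both texts at once
    and rewards a document that states a height when the query asks for one."""
    asks_height = "how tall" in query.lower()
    states_height = "tall" in doc.lower() and any(ch.isdigit() for ch in doc)
    return 1.0 if asks_height and states_height else 0.0


# Toy data: vectors are hand-written assumptions, not real model outputs.
query = "How tall is the Eiffel Tower?"
docs = [
    "The Eiffel Tower in Paris attracts millions of visitors.",  # same entities, no answer
    "The Eiffel Tower is 330 metres tall.",                      # actual answer
    "Bananas are rich in potassium.",                            # unrelated
]
query_vec = [1.0, 0.2, 0.0]
doc_vecs = [[0.9, 0.2, 0.0], [0.7, 0.6, 0.0], [0.0, 0.1, 1.0]]

cosine_only_best = docs[cosine_top_k(query_vec, doc_vecs, top_k=1)[0]]
reranked_best = retrieve_then_rerank(
    query, query_vec, docs, doc_vecs, toy_pair_scorer, top_k=2, top_n=1
)[0]
```

With these toy inputs, `cosine_only_best` is the entity-matching document that does not answer the question, while `reranked_best` is the one that states the height. Keeping `top_k` small is the design choice that bounds how many pairs the more expensive scorer has to evaluate.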
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T04:37:25.034027+00:00— report_created — created