Report #39802
[counterintuitive] cosine similarity is not semantic relevance
Use embedding similarity as a coarse first-pass filter, but follow it with a cross-encoder reranker or an LLM-based relevance check before passing documents to generation.
Journey Context:
RAG pipelines often rely purely on vector similarity to find relevant context. An embedding compresses a passage's meaning into a single vector, which loses nuance. High cosine similarity often reflects lexical or topical overlap rather than task-specific relevance or logical entailment, so the retriever returns 'related but useless' documents that dilute the context passed to generation.
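A minimal sketch of the two-stage approach described above, using only the Python standard library. Everything named here is hypothetical and for illustration: `retrieve_then_rerank`, its parameters, the bag-of-words `toy_embed`, and the hand-assigned `toy_relevance` scores are assumptions, not part of any library. In a real pipeline `embed` would call an embedding model and `relevance` would call a cross-encoder or an LLM-based check.

```python
import math
import re
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve_then_rerank(
    query: str,
    docs: List[str],
    embed: Callable[[str], Vector],          # assumed: text -> embedding
    relevance: Callable[[str, str], float],  # assumed: (query, doc) -> score
    first_pass_k: int = 20,
    final_k: int = 3,
    min_relevance: float = 0.5,
) -> List[Tuple[str, float]]:
    q_vec = embed(query)
    # Stage 1: coarse filter. Keep the top-k documents by cosine similarity.
    candidates = sorted(
        docs, key=lambda d: cosine(q_vec, embed(d)), reverse=True
    )[:first_pass_k]
    # Stage 2: score each surviving (query, doc) pair with the slower,
    # more precise relevance function and drop anything below the threshold.
    scored = [(d, relevance(query, d)) for d in candidates]
    kept = [pair for pair in scored if pair[1] >= min_relevance]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:final_k]


# --- Toy stand-ins so the sketch runs without any model (assumptions) ---
VOCAB = ["reset", "password", "account", "billing", "invoice"]


def toy_embed(text: str) -> Vector:
    """Bag-of-words counts over a tiny vocabulary; not a real embedding."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [float(tokens.count(word)) for word in VOCAB]


HOW_TO = "To reset your password, open Settings and choose Reset password."
POLICY = "Password policy: a password must be long; password reuse is blocked."
BILLING = "Invoices are emailed on the first of each month."

# Hand-assigned scores standing in for a cross-encoder or LLM judgment.
JUDGED = {HOW_TO: 0.9, POLICY: 0.2, BILLING: 0.0}


def toy_relevance(query: str, doc: str) -> float:
    return JUDGED.get(doc, 0.0)


result = retrieve_then_rerank(
    "how do I reset my password",
    [POLICY, BILLING, HOW_TO],
    embed=toy_embed,
    relevance=toy_relevance,
    first_pass_k=2,
)
print(result)
```

In this toy run the password-policy document passes the cosine filter because it shares vocabulary with the query, and the second stage then drops it because it does not answer the question. The `first_pass_k` and `min_relevance` values are illustrative and would need tuning against real data.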
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T21:16:50.367322+00:00— report_created — created