Report #78026
[counterintuitive] cosine similarity guarantees semantic match
Combine embedding similarity with a cross-encoder/reranker model and metadata filtering before passing chunks to the LLM.
Journey Context:
RAG pipelines often retrieve the top-K chunks purely by cosine similarity between query and chunk embeddings. An embedding compresses a chunk's meaning into a single vector, which tends to lose nuance, negation, and temporal ordering. High cosine similarity often just means shared vocabulary or topic, not that the chunk answers the specific question. A reranker (cross-encoder) scores the query and the document together, which typically yields higher precision on the query's actual intent.
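The flow described above (metadata filter, cosine shortlist, then rerank) can be sketched as follows. This is a minimal illustration, not a reference implementation: `embed` is a toy bag-of-words stand-in for a real embedding model, `toy_pair_score` is a hypothetical stand-in for a cross-encoder, and the chunk texts, the `version` metadata field, and all function names are assumptions made up for the example. In a real pipeline the `score_pair` callable would be backed by an actual reranker model.

```python
import math
from collections import Counter
from typing import Callable, Optional


def embed(text: str) -> Counter:
    """Toy bag-of-words vector (assumption: stands in for a real embedding model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def toy_pair_score(query: str, text: str) -> float:
    """Hypothetical stand-in for a cross-encoder: it looks at query and chunk
    together and penalizes a negation mismatch, which barely moves the cosine score."""
    mismatch = ("not" in query.lower().split()) != ("not" in text.lower().split())
    return cosine(embed(query), embed(text)) - (0.5 if mismatch else 0.0)


def retrieve(
    query: str,
    chunks: list,
    score_pair: Callable[[str, str], float],
    keep: Optional[Callable[[dict], bool]] = None,
    candidates: int = 20,
    top_k: int = 3,
) -> list:
    # 1. Metadata filter: drop chunks that cannot be valid answers at all.
    pool = [c for c in chunks if keep is None or keep(c["meta"])]
    # 2. First stage: cheap cosine similarity builds a shortlist.
    q = embed(query)
    shortlist = sorted(pool, key=lambda c: cosine(q, embed(c["text"])), reverse=True)[:candidates]
    # 3. Second stage: rerank the shortlist with the pairwise scorer.
    return sorted(shortlist, key=lambda c: score_pair(query, c["text"]), reverse=True)[:top_k]


chunks = [
    {"text": "the login endpoint does require authentication", "meta": {"version": "v2"}},
    {"text": "the health endpoint does not require authentication or any api key", "meta": {"version": "v2"}},
    {"text": "the status endpoint does not require authentication", "meta": {"version": "v1"}},  # stale docs
]
query = "which endpoint does not require authentication"


def is_current(meta: dict) -> bool:
    return meta["version"] == "v2"


def cosine_pair(q: str, t: str) -> float:
    return cosine(embed(q), embed(t))


cosine_only = retrieve(query, chunks, score_pair=cosine_pair, keep=is_current, top_k=1)
reranked = retrieve(query, chunks, score_pair=toy_pair_score, keep=is_current, top_k=1)

print(cosine_only[0]["text"])  # the login endpoint does require authentication
print(reranked[0]["text"])     # the health endpoint does not require authentication or any api key
```

With cosine alone, the top hit shares most of the query's vocabulary but states the opposite of what was asked; the pairwise scorer promotes the chunk that matches the negation. Without the `keep` filter, the stale `v1` chunk would rank first, which is why the metadata filter runs before either scoring stage.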
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T13:33:50.182762+00:00— report_created — created