Report #77095
[counterintuitive] high cosine similarity means relevant RAG context
Combine embedding similarity with keyword search (hybrid search) and cross-encoder reranking, rather than relying solely on vector distance.
Journey Context:
Developers often assume that vector embeddings capture exact semantic meaning, so the chunks with the highest cosine similarity must be the best ones to feed the LLM. However, an embedding compresses a whole chunk into a single vector, which loses nuance and exact keyword matches (e.g., specific IDs, names). High similarity can simply indicate topical overlap, not that the chunk answers the question. Hybrid search (BM25 + vector) and reranking are needed to recover that precision.
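A minimal sketch of the hybrid idea, assuming a toy corpus and hand-made 3-dimensional "embeddings" (both hypothetical, chosen only to show the failure mode). It ranks chunks by cosine similarity and by a small Lucene-style BM25, then merges the two rankings with reciprocal rank fusion (RRF), which is one common fusion method and not necessarily what any particular vector store uses. The names `bm25_scores`, `ranked`, and `rrf_fuse` are illustrative, not from a library.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each tokenized doc against the query with a small BM25 variant."""
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n
    df = Counter(t for d in docs for t in set(d))  # document frequency per term
    scores = []
    for d in docs:
        tf = Counter(d)
        s = 0.0
        for t in query:
            if tf[t] == 0:
                continue
            idf = math.log(1 + (n - df[t] + 0.5) / (df[t] + 0.5))
            s += idf * tf[t] * (k1 + 1) / (tf[t] + k1 * (1 - b + b * len(d) / avgdl))
        scores.append(s)
    return scores

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def ranked(scores):
    """Doc ids ordered best-first; docs with a score of 0 are left out."""
    ids = [i for i, s in enumerate(scores) if s > 0]
    return sorted(ids, key=lambda i: scores[i], reverse=True)

def rrf_fuse(rankings, k=60):
    """Reciprocal rank fusion: each ranking adds 1 / (k + rank) per doc."""
    fused = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            fused[doc_id] = fused.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(fused, key=fused.get, reverse=True)

# Hypothetical corpus: chunk 0 is topically close, chunk 1 holds the exact ID.
chunks = [
    "general overview of order processing and refunds",
    "order ORD-4471 was refunded on march 3",
    "shipping policy for international orders",
]
# Hand-made vectors standing in for a real embedding model (assumption).
chunk_vecs = [[0.9, 0.1, 0.3], [0.6, 0.3, 0.7], [0.1, 0.9, 0.2]]
query_text = "refund status ORD-4471"
query_vec = [0.9, 0.1, 0.4]

tokenize = lambda text: text.lower().split()
vector_rank = ranked([cosine(query_vec, v) for v in chunk_vecs])
keyword_rank = ranked(bm25_scores(tokenize(query_text), [tokenize(c) for c in chunks]))
fused = rrf_fuse([vector_rank, keyword_rank])

print("vector only:", vector_rank)
print("keyword only:", keyword_rank)
print("fused:", fused)
```

With these toy inputs, vector-only ranking puts the generic overview chunk first, while the fused ranking promotes the chunk that contains the exact ID. In a fuller pipeline, the top fused candidates would then be rescored by a cross-encoder that reads the query and each chunk together; that step is omitted here because it requires a model.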
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T12:00:10.317392+00:00— report_created — created