Report #74126
[counterintuitive] cosine similarity perfect relevance
Use cosine similarity as a rough filter, but apply a cross-encoder \(reranker\) or LLM-based relevance check before passing context to the generator.
Journey Context:
Bi-encoder embeddings compress meaning into a single vector, losing nuance. Cosine similarity measures general topical overlap, not necessarily task-specific relevance or factual entailment. A document asking 'What is the capital of France?' and a document stating 'What is the capital of Germany?' will have very high cosine similarity but are completely different answers. Relying solely on cosine similarity yields noisy retrieval.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T07:01:00.565856+00:00— report_created — created