Report #97323
[architecture] How do I combine keyword and vector search in hybrid search without hand-tuning a magic alpha for every query?
Store sparse-dense vectors in the same index, fuse results with Reciprocal Rank Fusion (RRF) or a query-aware weighting rule (short keyword queries → higher lexical weight; long natural-language questions → higher semantic weight), and always rerank the fused candidate pool with a cross-encoder instead of trusting the raw blended score.
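As a rough illustration of the fusion step, here is a minimal RRF sketch in Python. It assumes each retriever returns a list of document ids ordered best-first; the function name `rrf_fuse`, the sample ids, and the constant `k=60` (a commonly used default, not something this report prescribes) are illustrative assumptions.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse several best-first lists of doc ids with Reciprocal Rank Fusion.

    Each document receives sum(1 / (k + rank)) over the lists it appears in,
    so only rank positions matter and raw score scales never interact.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical result lists from a lexical (BM25) and a dense retriever.
bm25_hits = ["d3", "d1", "d7"]
dense_hits = ["d1", "d9", "d3"]

fused = rrf_fuse([bm25_hits, dense_hits])
print(fused)  # ['d1', 'd3', 'd9', 'd7']
```

Documents that both retrievers rank highly (`d1`, `d3`) rise to the top without any per-query alpha. The fused list is only a candidate pool and would still go to the cross-encoder reranker.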
Journey Context:
Simple score blending of BM25 and cosine similarity is fragile: BM25 scores are unbounded and depend on corpus statistics, cosine similarity is bounded in [-1, 1], and the right balance between them varies per query. Sparse-dense indexes let you represent both signals in one vector, but the retrieval architecture still needs a stable fusion layer. RRF avoids the score-scale problem by using only each document's rank in each result list. A query-length heuristic or a small learned calibrator can then set lexical vs. semantic emphasis. Finally, a cross-encoder reranker scores each query-document pair in the small fused candidate set jointly and fixes the last-mile relevance problem that raw hybrid scores cannot.
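One way the query-length heuristic could look is sketched below in Python. Everything specific here is an assumption for illustration: the function names, the token thresholds (3 and 12), the weight range (0.7 down to 0.3), and whitespace tokenization. Real values would need tuning or a learned calibrator.

```python
def lexical_weight(query, short_len=3, long_len=12):
    """Map query length (whitespace tokens) to a lexical weight.

    Short keyword queries get 0.7, long questions get 0.3, and lengths
    in between are linearly interpolated. All thresholds are illustrative.
    """
    n = len(query.split())
    if n <= short_len:
        return 0.7
    if n >= long_len:
        return 0.3
    frac = (n - short_len) / (long_len - short_len)
    return 0.7 - 0.4 * frac


def weighted_rrf(lexical_hits, dense_hits, w_lex, k=60):
    """Rank-based fusion where each list's contribution is scaled by a weight."""
    scores = {}
    for weight, ranking in ((w_lex, lexical_hits), (1.0 - w_lex, dense_hits)):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + weight / (k + rank)
    # Highest fused score first; ties broken by doc id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical usage: a short keyword query leans on the lexical ranking.
w = lexical_weight("error E1234")
candidates = weighted_rrf(["a", "b"], ["b", "a"], w)
print(w, candidates)
```

Because the weights scale rank contributions rather than raw scores, the heuristic changes emphasis without bringing the score-scale problem back.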
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-25T04:55:43.175444+00:00— report_created — created