Report #3553
[architecture] First-stage retrieval returns good candidates in the wrong order
Add a dedicated reranker after initial retrieval: retrieve with a fast hybrid or dense stage, then score the top-k with a cross-encoder or LLM reranker before passing context to the generator.
Journey Context:
Embedding-based and lexical first-stage rankers are cheap but shallow: they score the query and each document independently, so they surface likely documents but order them poorly by true relevance to the full query. A reranker reads the query and each candidate together and produces a much better ordering. It adds latency and cost per candidate, so limit it to the top 50-200 candidates. Hybrid retrieval + reranking is consistently among the strongest practical setups in retrieval benchmarks.
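The retrieve-then-rerank step described above can be sketched roughly as follows. This is a minimal illustration, not a reference implementation: `rerank`, `PairScorer`, `rerank_depth`, `keep`, and `toy_overlap_scorer` are hypothetical names introduced here, and the toy scorer is only a stand-in for a real cross-encoder or LLM reranker so that the example runs without any model.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical type: any function that scores a (query, document) pair jointly.
# In a real pipeline this would wrap a cross-encoder or an LLM call.
PairScorer = Callable[[str, str], float]


def rerank(
    query: str,
    candidates: Sequence[str],
    score_pair: PairScorer,
    rerank_depth: int = 100,
    keep: int = 5,
) -> List[Tuple[str, float]]:
    """Rescore the head of a first-stage candidate list and return the best `keep`."""
    # Only the first `rerank_depth` candidates are rescored; this bounds latency and cost.
    head = list(candidates[:rerank_depth])
    scored = [(doc, score_pair(query, doc)) for doc in head]
    # list.sort is stable, so tied documents keep their first-stage order.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:keep]


def toy_overlap_scorer(query: str, doc: str) -> float:
    """Placeholder scorer (assumption): fraction of query tokens present in the document."""
    q_tokens = set(query.lower().split())
    d_tokens = set(doc.lower().split())
    return len(q_tokens & d_tokens) / len(q_tokens) if q_tokens else 0.0


# Candidates in the order a first-stage retriever might return them.
first_stage = [
    "resetting a router to factory settings",
    "how to reset a forgotten account password",
    "password policy for new accounts",
]

# The second candidate moves to the front after reranking.
for doc, score in rerank("reset forgotten password", first_stage, toy_overlap_scorer, keep=2):
    print(f"{score:.2f}  {doc}")
```

Passing the scorer in as a function keeps the first-stage retriever and the reranker independent, so either can be swapped without touching the other. `rerank_depth` corresponds to the 50-200 candidate limit mentioned above; `keep` is the number of passages handed to the generator.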
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-15T17:32:17.699457+00:00— report_created — created