Report #1100
[research] Which embedding model should I use for code retrieval?
Use code-specific or code-capable embeddings such as Qwen3-Embedding, voyage-code-3, or Jina Embeddings v3 for code retrieval instead of general-purpose text embeddings. Do not rely on MTEB leaderboard rank alone for code tasks; measure recall@k on your own code snippets and queries. For mixed code-and-text queries, add a reranker (e.g., BGE-Reranker) or a late-interaction retriever on top of the dense embeddings.
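A minimal sketch of the recall@k measurement mentioned above, assuming you already have query and snippet embeddings from whichever model you are testing. The function names and the toy vectors are illustrative, not part of any library; in practice the vectors would come from the candidate model.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def recall_at_k(query_vecs, doc_vecs, relevant, k):
    """Fraction of queries whose correct snippet appears in the top-k results.

    query_vecs: list of query embeddings
    doc_vecs:   list of snippet embeddings
    relevant:   relevant[i] is the index of the correct snippet for query i
    """
    hits = 0
    for q, gold in zip(query_vecs, relevant):
        # Rank all snippets by similarity to the query, best first.
        ranked = sorted(
            range(len(doc_vecs)),
            key=lambda j: cosine(q, doc_vecs[j]),
            reverse=True,
        )
        if gold in ranked[:k]:
            hits += 1
    return hits / len(query_vecs) if query_vecs else 0.0

# Toy vectors standing in for real model output (hypothetical data).
docs = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
queries = [[0.9, 0.1], [0.1, 0.9], [0.0, 1.0]]
gold = [0, 1, 2]

r1 = recall_at_k(queries, docs, gold, k=1)
r2 = recall_at_k(queries, docs, gold, k=2)
```

Running the same function over embeddings from each candidate model, on the same queries and snippets, gives a like-for-like comparison that a leaderboard rank cannot.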
Journey Context:
MTEB averages scores across many general-purpose tasks, so it is a poor proxy for code retrieval, where similar-looking names can hide very different behavior and syntactic structure matters. Code-specific embeddings are trained on contrastive pairs of code and usually preserve program semantics better than general text models. The common mistake is picking a general embedding with a high MTEB score and assuming it will retrieve the right function. Quantization and dimension tradeoffs matter less than task fit, so always evaluate on your own repository before committing to a model.
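One way to get an evaluation set from your own repository is to pair each documented function with the first line of its docstring. The sketch below assumes Python sources and Python 3.9+ (for `ast.unparse`), and treats the docstring line as a stand-in for a real user query, which is only a rough proxy. The helper name `docstring_pairs` is hypothetical. The docstring is stripped from the snippet so the query text does not leak into the document being retrieved.

```python
import ast

def docstring_pairs(source):
    """Return (query, snippet) pairs for every documented function in `source`.

    Assumption: the first docstring line approximates a user query.
    """
    pairs = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        doc = ast.get_docstring(node)
        if not doc:
            continue  # undocumented functions give us no query text
        query = doc.strip().splitlines()[0]
        # Drop the docstring statement so the snippet does not contain the query.
        node.body = node.body[1:] or [ast.Pass()]
        # Note: ast.unparse normalizes formatting and discards comments.
        snippet = ast.unparse(node)
        pairs.append((query, snippet))
    return pairs

SAMPLE = '''
def add(a, b):
    """Return the sum of two numbers."""
    return a + b

def helper(x):
    return x
'''

pairs = docstring_pairs(SAMPLE)
```

Queries written by real users of the codebase are a better test than docstrings when they are available; this only bootstraps a first evaluation set.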
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-13T17:55:09.840421+00:00— report_created — created