Report #3014
[research] What embedding model should I use for RAG and semantic search?
Check the MTEB leaderboard for your task category (for RAG, the retrieval score), not just the overall score. For English retrieval, top open options include NV-Embed-v2, GritLM, and SFR-Embedding-Mistral; for a hosted API, consider Gemini Embedding. For multilingual use, consider BGE-M3, multilingual-e5-large-instruct, jina-embeddings-v3, or Qwen3-Embedding. For small, fast models, use nomic-embed-text-v1.5/v2 or all-MiniLM-L6-v2. For code retrieval, use code-specific leaders such as C2LLM; jina-colbert-v2 is a general-purpose late-interaction retriever rather than a code-specific model, so benchmark it on your own code corpus before relying on it.
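Leaderboard scores are only a starting point, so it can help to score shortlisted models on a handful of your own queries. The sketch below is a minimal recall@k harness, not a definitive benchmark. `toy_embed` is a hypothetical bag-of-words stand-in so the example runs without downloads, and the documents and queries are made up for illustration; swap in a real model's encode function to compare candidates.

```python
import math
from collections import Counter


def toy_embed(text):
    """Hypothetical stand-in embedder (bag-of-words counts). Replace with a real model."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recall_at_k(embed, queries, docs, relevant, k=1):
    """Fraction of queries whose relevant document id appears in the top-k results."""
    doc_vecs = {doc_id: embed(text) for doc_id, text in docs.items()}
    hits = 0
    for q_id, q_text in queries.items():
        q_vec = embed(q_text)
        ranked = sorted(doc_vecs, key=lambda d: cosine(q_vec, doc_vecs[d]), reverse=True)
        if relevant[q_id] in ranked[:k]:
            hits += 1
    return hits / len(queries)


# Made-up evaluation set: short queries against longer passages.
docs = {
    "d1": "reset your password from the account settings page",
    "d2": "invoices are emailed on the first day of each month",
    "d3": "the api rate limit is 100 requests per minute",
}
queries = {"q1": "how do i reset my password", "q2": "what is the api rate limit"}
relevant = {"q1": "d1", "q2": "d3"}

score = recall_at_k(toy_embed, queries, docs, relevant, k=1)
print(f"recall@1 = {score:.2f}")
```

Running the same harness once per candidate model, with the same queries and relevance labels, gives a like-for-like comparison on your own data.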
Journey Context:
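The overall MTEB rank hides large variance across task categories: classification, clustering, retrieval, and semantic textual similarity (STS). A model that is great at sentence similarity can be mediocre at asymmetric retrieval, where a short query must match a much longer passage. Context length, language coverage, and license also matter. ColBERT-style late-interaction models keep one vector per token instead of one per document, trading extra storage and compute for better precision on long documents.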
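To make the late-interaction trade-off concrete, here is a minimal sketch of MaxSim scoring (the scoring rule used by ColBERT-style models) next to ordinary mean-pooled single-vector scoring. The 2-dimensional token vectors are invented for illustration and do not come from any real model; the function names are likewise hypothetical.

```python
import math


def cos(u, v):
    """Cosine similarity between two dense vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def mean_pool(token_vectors):
    """Collapse per-token vectors into one vector by averaging each dimension."""
    n = len(token_vectors)
    return [sum(col) / n for col in zip(*token_vectors)]


def single_vector_score(query_tokens, doc_tokens):
    """Dense retrieval: one pooled vector per text, one similarity per pair."""
    return cos(mean_pool(query_tokens), mean_pool(doc_tokens))


def maxsim_score(query_tokens, doc_tokens):
    """Late interaction: each query token takes its best-matching document token."""
    return sum(max(cos(q, d) for d in doc_tokens) for q in query_tokens)


# Invented toy vectors. doc_long contains both query tokens plus unrelated content
# that cancels them out when pooled; doc_short is a single loosely related token.
query = [[1.0, 0.0], [0.0, 1.0]]
doc_long = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
doc_short = [[1.0, 1.0]]

for name, doc in (("doc_long", doc_long), ("doc_short", doc_short)):
    print(name, "pooled:", round(single_vector_score(query, doc), 3),
          "maxsim:", round(maxsim_score(query, doc), 3),
          "vectors stored:", len(doc))
```

In this toy case pooling dilutes the long document and ranks it below the short one, while MaxSim still finds the exact token matches. The cost is visible in the last column: the late-interaction index stores one vector per token rather than one per document.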
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-15T14:55:04.042344+00:00— report_created — created