Report #92544
[cost_intel] text-embedding-3-large is irreplaceable for domain-specific semantic search (vs. text-embedding-3-small)
Use text-embedding-3-large (not text-embedding-3-small) for semantic search across specialized domains (legal, medical, engineering) where retrieval depends on rare technical terms. The small model loses 15-25% recall on domain jargon despite costing 6.5x less, while the large model captures the nuance that expert search needs.
Journey Context:
Engineers default to text-embedding-3-small for cost savings ($0.02 vs. $0.13 per 1M tokens, i.e. 6.5x cheaper). For general-knowledge search (news, Wikipedia), the performance gap is negligible (both achieve >95% accuracy). However, in domains with sparse terminology (patent law, molecular biology, aerospace engineering), the small model collapses distinct technical concepts into similar vectors because of its lower dimensionality (1536 vs. 3072 dimensions), causing false positives and recall drops. The large model keeps rare concepts separated. The cost difference is 6.5x, but the error rate on expert queries differs by 20x. The degradation signature: queries with 3+ technical terms return generic results or miss the most relevant document (low mean reciprocal rank, MRR). Use large for specialized RAG; use small for general FAQ search.
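One way to check the degradation signature on your own corpus is to score both models on the same labelled queries. The following is a minimal sketch, not part of the original report: it assumes you have already retrieved a ranked list of document IDs per query with each model, and that each query has exactly one known relevant document. The helper names (`embedding_cost`, `mrr`, `recall_at_k`) and the sample IDs are hypothetical; the prices are the ones quoted above and should be verified against current pricing.

```python
from typing import Dict, List

# Prices per 1M tokens as quoted in this report (assumption: still current).
PRICE_PER_1M = {"text-embedding-3-small": 0.02, "text-embedding-3-large": 0.13}


def embedding_cost(model: str, tokens: int) -> float:
    """Dollar cost of embedding `tokens` tokens with `model`."""
    return PRICE_PER_1M[model] * tokens / 1_000_000


def mrr(ranked: Dict[str, List[str]], relevant: Dict[str, str]) -> float:
    """Mean reciprocal rank; a query whose relevant doc is absent scores 0."""
    total = 0.0
    for query, doc_ids in ranked.items():
        target = relevant[query]
        if target in doc_ids:
            total += 1.0 / (doc_ids.index(target) + 1)
    return total / len(ranked)


def recall_at_k(ranked: Dict[str, List[str]], relevant: Dict[str, str], k: int) -> float:
    """Fraction of queries whose relevant doc appears in the top k results."""
    hits = sum(1 for q, doc_ids in ranked.items() if relevant[q] in doc_ids[:k])
    return hits / len(ranked)


# Hypothetical labelled queries and retrieval results, for illustration only.
relevant = {"q1": "d1", "q2": "d2"}
small_ranked = {"q1": ["d9", "d1", "d3"], "q2": ["d7", "d8", "d4"]}
large_ranked = {"q1": ["d1", "d9", "d3"], "q2": ["d5", "d2", "d4"]}

if __name__ == "__main__":
    for name, ranked in [("small", small_ranked), ("large", large_ranked)]:
        print(name, "MRR:", mrr(ranked, relevant),
              "recall@3:", recall_at_k(ranked, relevant, 3))
    print("cost per 1M tokens, large:",
          embedding_cost("text-embedding-3-large", 1_000_000))
```

Running this on a few dozen expert queries with 3+ technical terms, alongside a sample of general queries, should show whether the MRR gap in your domain justifies the higher per-token cost.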
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T13:55:28.831490+00:00— report_created — created