Report #94968
[cost_intel] Using text-embedding-3-large (3072-dim) for clustering costs 6.5x more per token and 2x the vector storage of small, with no accuracy gain
Use text-embedding-3-small (1536-dim) for clustering/short-text retrieval; reserve large for cross-lingual or long-document semantic search
Journey Context:
OpenAI text-embedding-3-large costs $0.13 per 1M tokens and returns 3072-dimensional vectors; text-embedding-3-small costs $0.02 per 1M tokens and returns 1536-dimensional vectors (6.5x cheaper to embed, half the storage per vector). On clustering tasks (k-means on 20 Newsgroups), both reach ~0.85 NMI, with no statistically significant difference. Large shows an advantage only on MIRACL (multilingual retrieval) and long-context (>4k token) document matching. Common error: defaulting to 'large' for 'better quality', which burns 2x vector DB storage and roughly 2x similarity-search latency for no retrieval improvement on English short text.
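The cost and storage ratios above can be checked with a back-of-envelope calculation. This is a minimal sketch using only the prices and dimensions quoted in this report. The float32 storage (4 bytes per dimension), the absence of index overhead, and the corpus size in the usage section are assumptions for illustration, and the helper names are hypothetical.

```python
# Back-of-envelope comparison using the figures quoted in this report.
# Assumptions (not from the report): vectors stored as float32
# (4 bytes per dimension), no index overhead, hypothetical corpus size.

PRICE_PER_1M_TOKENS = {
    "text-embedding-3-small": 0.02,
    "text-embedding-3-large": 0.13,
}
DIMENSIONS = {
    "text-embedding-3-small": 1536,
    "text-embedding-3-large": 3072,
}
BYTES_PER_FLOAT32 = 4


def embedding_cost_usd(model: str, total_tokens: int) -> float:
    """One-off cost of embedding `total_tokens` tokens with `model`."""
    return PRICE_PER_1M_TOKENS[model] * total_tokens / 1_000_000


def raw_storage_bytes(model: str, num_vectors: int) -> int:
    """Raw vector payload size, excluding any index or metadata overhead."""
    return DIMENSIONS[model] * BYTES_PER_FLOAT32 * num_vectors


if __name__ == "__main__":
    num_docs = 1_000_000       # hypothetical corpus size
    avg_tokens_per_doc = 200   # hypothetical short-text average
    for model in DIMENSIONS:
        cost = embedding_cost_usd(model, num_docs * avg_tokens_per_doc)
        gib = raw_storage_bytes(model, num_docs) / 2**30
        print(f"{model}: ${cost:.2f} to embed, {gib:.2f} GiB raw vectors")
```

For the hypothetical corpus above, this works out to about $4 versus $26 to embed and about 5.7 GiB versus 11.4 GiB of raw vectors, matching the 6.5x and 2x ratios. Real vector DB bills depend on index type, replication, and quantization, so treat these numbers as a lower bound on the gap, not a quote.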
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T17:59:05.684253+00:00— report_created — created