Report #38608
[cost\_intel] Using full 3072-dimension embeddings when 1024-dim truncated vectors suffice for RAG
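Request 1024-dimension vectors from text-embedding-3-large by setting the dimensions parameter on the embeddings call. This cuts raw vector storage and per-comparison distance computation by about two-thirds (1024/3072 = 1/3 of the original size), with a reported <1% accuracy loss on MTEB benchmarks.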
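A minimal sketch of the request, assuming the official `openai` Python SDK (v1-style client). The helper name `embed_texts` and the length check are illustrative and not part of the SDK. The client is passed in so the helper can be exercised without network access.

```python
from typing import List, Sequence

MODEL = "text-embedding-3-large"
TARGET_DIMS = 1024  # the model returns 3072 dims when `dimensions` is omitted


def embed_texts(client, texts: Sequence[str], dimensions: int = TARGET_DIMS) -> List[List[float]]:
    """Return one `dimensions`-length embedding per input text.

    `client` is assumed to be an `openai.OpenAI` instance.
    """
    response = client.embeddings.create(
        model=MODEL,
        input=list(texts),
        dimensions=dimensions,
    )
    vectors = [item.embedding for item in response.data]
    # Guard against silently storing full-size vectors.
    for vec in vectors:
        if len(vec) != dimensions:
            raise ValueError(f"expected {dimensions} dims, got {len(vec)}")
    return vectors


# Usage (requires the `openai` package and OPENAI_API_KEY in the environment):
#   from openai import OpenAI
#   vectors = embed_texts(OpenAI(), ["how do I rotate API keys?"])
```

The vector index must be created with the same dimensionality (1024 here). Existing 3072-dim indexes generally cannot accept shorter vectors, so plan for a re-index.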
Journey Context:
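text-embedding-3-large returns 3072 dimensions by default. For vector stores such as Pinecone, Weaviate, and pgvector, storage footprint and per-comparison distance cost both grow linearly with dimensionality, so vector width directly drives cost. Many users store full 3072-dim vectors on the assumption that maximum accuracy is necessary. However, the model is trained with Matryoshka representation learning, which concentrates the most useful information in the leading components, so vectors can be shortened to 1024 or 512 dims with minimal performance loss. OpenAI's published MTEB average score is 64.6 at 3072 dims and 64.1 at 1024 dims, a relative drop under 1%; the NDCG@10 impact on a specific RAG corpus should be confirmed with a small retrieval evaluation. Going from 3072 to 1024 dims removes two-thirds of the dimensions (about 67%), which means roughly two-thirds less raw vector storage and noticeably faster similarity search, with little measurable quality loss for typical RAG retrieval.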
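The storage arithmetic can be sketched as follows, together with a helper for shortening vectors that were already stored at full width. The corpus size of one million vectors and the function names are hypothetical. The byte count covers only the raw float32 payload, not index or metadata overhead. Vectors truncated by hand need to be re-normalized to unit length before cosine or dot-product search, whereas vectors requested with the dimensions parameter come back already normalized.

```python
import math
from typing import List, Sequence

BYTES_PER_FLOAT32 = 4


def raw_vector_bytes(num_vectors: int, dims: int) -> int:
    """Raw float32 payload size, ignoring index and metadata overhead."""
    return num_vectors * dims * BYTES_PER_FLOAT32


def truncate_and_normalize(vector: Sequence[float], dims: int) -> List[float]:
    """Keep the first `dims` components and rescale to unit L2 norm."""
    if dims > len(vector):
        raise ValueError("cannot truncate to more dimensions than the vector has")
    head = list(vector[:dims])
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return head  # degenerate all-zero prefix; nothing to rescale
    return [x / norm for x in head]


# Hypothetical corpus of one million chunks.
full = raw_vector_bytes(1_000_000, 3072)
small = raw_vector_bytes(1_000_000, 1024)
print(f"3072 dims: {full / 1e9:.2f} GB")   # 12.29 GB
print(f"1024 dims: {small / 1e9:.2f} GB")  # 4.10 GB
print(f"saved: {1 - small / full:.1%}")    # 66.7%
```

Actual billing depends on the provider's pricing unit (pod size, storage tier, instance class), so the saving on the invoice may differ from the raw 66.7% figure.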
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T19:16:57.330800+00:00— report_created — created