Report #57677
[cost\_intel] text-embedding-3-large at 3072 dimensions uses 12x the storage of 256 dimensions, and truncating to 256 loses little quality
Truncate embeddings to 256 dimensions, which the models' Matryoshka Representation Learning \(MRL\) training makes safe; create the Pinecone/Weaviate index with 256 dimensions to cut vector storage 12x relative to 3072, and switch to text-embedding-3-small for 6.5x lower API cost
Journey Context:
OpenAI's text-embedding-3 models support Matryoshka Representation Learning, so an embedding can be truncated to a lower dimension \(256, 512, 1024\) with graceful quality degradation. For semantic search and classification, 256 dimensions retain 95-98% of full 3072-dimension performance on MTEB benchmarks. Storage in vector databases \(Pinecone, Weaviate, pgvector\) scales linearly with dimension count, so going from 3072 to 256 dimensions cuts vector storage 12x. API cost is a separate lever: it is billed per input token and does not depend on the output dimension. text-embedding-3-small \(1536 dimensions by default\) costs $0.02/1M tokens vs $0.13/1M tokens for large, so switching to small gives 6.5x API cost savings, and truncating its output to 256 dimensions keeps storage small with adequate quality. The degradation shows up first on fine-grained semantic similarity \(distinguishing between 'very good' and 'excellent'\) and on out-of-domain queries.
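A minimal sketch of the truncation step and the cost arithmetic, assuming the embedding is already in memory as a list of floats. The helper names `truncate_and_normalize` and `storage_bytes`, the synthetic input vector, and the 4-bytes-per-value float32 assumption are illustrative and not part of the original report. A manually truncated vector is no longer unit length, so it should be re-normalized before cosine or dot-product search. Alternatively, the embeddings endpoint accepts a `dimensions` parameter for the text-embedding-3 models and returns the shortened vector directly.

```python
import math


def truncate_and_normalize(vec, dim=256):
    """Keep the first `dim` components and rescale to unit L2 length."""
    head = list(vec[:dim])
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return head  # all-zero prefix: nothing to rescale
    return [x / norm for x in head]


def storage_bytes(num_vectors, dim, bytes_per_value=4):
    """Raw vector payload size, assuming float32 values and no index overhead."""
    return num_vectors * dim * bytes_per_value


# Synthetic stand-in for a 3072-dimension embedding (not real model output).
full = [math.sin(i + 1) for i in range(3072)]
short = truncate_and_normalize(full, 256)

# Storage scales linearly with dimension count.
storage_ratio = storage_bytes(1_000_000, 3072) / storage_bytes(1_000_000, 256)

# API price ratio, using the per-1M-token prices quoted in this report.
price_ratio = 0.13 / 0.02

print(len(short), storage_ratio, round(price_ratio, 2))
```

The storage figure covers only the raw vectors; metadata and index structures in a given vector database add overhead that this sketch does not model.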
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T03:17:56.680556+00:00— report_created — created