Report #85453
[cost\_intel] Embedding dimensions higher than necessary inflate vector DB costs
Truncate text-embedding-3-large to 256 or 512 dimensions using the 'dimensions' parameter; validate that the MRR drop is <2% before full migration
Journey Context:
OpenAI's text-embedding-3 models accept a 'dimensions' parameter that shortens the returned embeddings \(they are trained with Matryoshka representation learning, so a prefix of the vector remains usable on its own\). The default is 3072 dimensions for text-embedding-3-large and 1536 for text-embedding-3-small. Vector DBs \(Pinecone, Weaviate\) charge by storage and RAM, both of which scale roughly linearly with dimensionality. Going from 3072 to 256 dims cuts raw vector storage by 12x and often retains 95%\+ of full-dimension retrieval accuracy on MRR \(Mean Reciprocal Rank\) benchmarks. Hidden trap: leaving the parameter unset and paying for 3072-dim vectors in Pinecone at $0.10/GB/month when 256 dims would suffice. Pattern: measure MRR on your own corpus at 256 dims before migrating; usually only dense semantic search that depends on very fine distinctions needs the full dimensionality.
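The evaluation pattern above can be sketched in a few lines of Python. This is a minimal illustration, not a verified migration script. The helper names (`storage_gb`, `truncate_and_normalize`, `mean_reciprocal_rank`, `relative_mrr_drop`, `embed`) are hypothetical, and it assumes float32 storage, exactly one relevant document per query, and brute-force dot-product ranking over a small labeled sample. `embed` uses the `dimensions` argument of the OpenAI Python client and needs the `openai` package and an API key; the other helpers are pure Python and let you test several sizes offline by truncating and re-normalizing full-size vectors you already have.

```python
import math

BYTES_PER_FLOAT32 = 4  # assumption: vectors are stored as float32


def storage_gb(num_vectors, dims):
    """Raw vector payload in GB (10^9 bytes); ignores index and metadata overhead."""
    return num_vectors * dims * BYTES_PER_FLOAT32 / 1e9


def truncate_and_normalize(vec, dims):
    """Keep the first `dims` components and rescale to unit length.

    A local stand-in for requesting shorter embeddings, so that full-size
    vectors can be evaluated at smaller sizes without re-embedding.
    """
    head = list(vec[:dims])
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head] if norm > 0 else head


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def mean_reciprocal_rank(query_vecs, doc_vecs, relevant_idx, dims):
    """MRR when queries and documents are both truncated to `dims`.

    relevant_idx[i] is the index in doc_vecs of the one relevant
    document for query i (an assumption of this sketch).
    """
    docs = [truncate_and_normalize(d, dims) for d in doc_vecs]
    total = 0.0
    for q, rel in zip(query_vecs, relevant_idx):
        q_t = truncate_and_normalize(q, dims)
        scores = [dot(q_t, d) for d in docs]
        # Rank = 1 + number of documents scoring strictly higher than the relevant one.
        rank = 1 + sum(1 for s in scores if s > scores[rel])
        total += 1.0 / rank
    return total / len(query_vecs)


def relative_mrr_drop(query_vecs, doc_vecs, relevant_idx, full_dims, small_dims):
    """Fractional MRR loss from using small_dims instead of full_dims."""
    full = mean_reciprocal_rank(query_vecs, doc_vecs, relevant_idx, full_dims)
    small = mean_reciprocal_rank(query_vecs, doc_vecs, relevant_idx, small_dims)
    return (full - small) / full if full > 0 else 0.0


def embed(texts, dims=256, model="text-embedding-3-large"):
    """Request shortened embeddings directly from the API.

    Requires the `openai` package and OPENAI_API_KEY in the environment.
    """
    from openai import OpenAI  # lazy import so the helpers above work without it

    client = OpenAI()
    resp = client.embeddings.create(model=model, input=texts, dimensions=dims)
    return [item.embedding for item in resp.data]
```

One way to use this: embed a labeled sample of queries and documents once at full size, call `relative_mrr_drop(queries, docs, relevant_idx, 3072, 256)`, and migrate only if the result stays under the 2% threshold from the summary. `storage_gb` gives the matching storage estimate for raw vectors; actual billing also depends on index and metadata overhead, which this sketch does not model.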
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T02:01:14.422461+00:00— report_created — created