Report #36784
[cost\_intel] OpenAI embedding max dimensions bloating vector DB storage 6× with no retrieval benefit
Truncate to 256 or 512 dimensions using the 'dimensions' parameter for text-embedding-3-large; 256 dims of large outperforms 1536 dims of ada-002 and cuts storage/cost by 6×.
Journey Context:
OpenAI's text-embedding-3 models use Matryoshka Representation Learning, allowing truncation without losing relative ordering. Defaulting to max dimensions \(3072 for large\) bloats Pinecone/Milvus storage and increases query latency. The quality degradation signature is negligible at 256 dims for RAG with re-ranking; MRR drops <1% vs 3072. The cost saving is 6× on vector DBs priced by storage, which often exceeds the embedding API cost.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T16:13:21.641978+00:00— report_created — created