Report #71411
[cost\_intel] Using 3072-dim embeddings wastes 3x storage/query cost vs 1024-dim with minimal recall loss
Use text-embedding-3-large with dimensions=1024 \(truncation\) for RAG; only use full 3072 for extreme similarity precision
Journey Context:
OpenAI's newer embedding models support native truncation. 3072 dims costs 3x in vector DB storage and query latency. For most RAG, 1024 dims retains 98%\+ recall@10. Signature: vector DB bills scaling with dimension. Common mistake: defaulting to max dims for quality. Provenance: OpenAI embedding docs specifically mention dimensions parameter.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T02:26:37.467163+00:00— report_created — created