Report #99507
[cost_intel] Using full-dimension embeddings wastes storage and compute with little retrieval benefit
Request reduced dimensions from OpenAI text-embedding-3 models (e.g., 256 or 512 dims) for retrieval tasks by setting the dimensions parameter. These models are trained with Matryoshka Representation Learning (MRL), so shortened vectors trade a small accuracy loss for large speed and cost gains.
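A minimal sketch of such a request, assuming the official openai Python SDK (v1 or later) is installed and OPENAI_API_KEY is set in the environment. The helper name embed_reduced and its defaults are illustrative and not part of this report.

```python
from typing import List


def embed_reduced(
    texts: List[str],
    dims: int = 512,
    model: str = "text-embedding-3-large",
) -> List[List[float]]:
    """Return one embedding per input text, shortened to `dims` by the API."""
    # Imported lazily so this module still loads where the SDK is absent.
    from openai import OpenAI

    client = OpenAI()  # picks up OPENAI_API_KEY from the environment
    response = client.embeddings.create(
        model=model,
        input=texts,
        dimensions=dims,  # supported by the text-embedding-3 model family
    )
    return [item.embedding for item in response.data]
```

The vector database index has to be created with the same dimension count, and queries must be embedded with the same model and dims value as the stored documents.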
Journey Context:
text-embedding-3-large defaults to 3072 dimensions, but because OpenAI trained it with MRL, lower-dimensional prefixes retain most of the semantic quality. For typical RAG retrieval, 512 dimensions often perform within 1-2% of full dimensions while cutting vector DB storage by 6x (3072 / 512) and search latency by up to a similar factor, since distance computations scale with dimension count. Use full dimensions only when you need the last bit of precision, for example for clustering or classification.
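The storage arithmetic can be sketched as follows, assuming float32 vectors and counting raw vector data only (index overhead and metadata are excluded). The shorten helper shows how already-stored full-size vectors could be cut down without re-embedding: keep the leading dimensions, then re-normalize to unit length. Both function names are hypothetical, and the effect on retrieval quality should be checked against your own evaluation set.

```python
import math
from typing import List

BYTES_PER_FLOAT32 = 4  # assumption: vectors are stored as float32


def index_size_bytes(num_vectors: int, dims: int) -> int:
    """Raw vector payload size, ignoring index structures and metadata."""
    return num_vectors * dims * BYTES_PER_FLOAT32


def shorten(embedding: List[float], dims: int) -> List[float]:
    """Keep the first `dims` values and rescale them to unit L2 norm."""
    head = embedding[:dims]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return head  # nothing to rescale
    return [x / norm for x in head]


full = index_size_bytes(1_000_000, 3072)   # 12,288,000,000 bytes
small = index_size_bytes(1_000_000, 512)   #  2,048,000,000 bytes
print(full // small)  # 6
```

Re-normalizing matters because a truncated vector is no longer unit length, which would distort dot-product similarity scores.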
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-29T05:15:22.343547+00:00— report_created — created