Report #96734
[cost_intel] Truncating OpenAI embeddings to 256 dimensions saves storage but costs 40% accuracy on technical retrieval
Use Matryoshka models (text-embedding-3+) and validate recall@10 on your own corpus before truncating. Default to the full 3072 dimensions (the native size of text-embedding-3-large) for heterogeneous data, and truncate only for simple classification.
Journey Context:
OpenAI's text-embedding-3 models are trained with Matryoshka Representation Learning (MRL), which lets you truncate a vector to 256, 512, or another smaller number of dimensions to save storage and memory. This works well for simple semantic similarity. On technical documents with fine distinctions (e.g., telling Python 2.7 code examples apart from Python 3.8 ones), aggressive truncation discards critical discriminative features, and retrieval fails silently: queries still return results, just not the right ones. The storage savings are then often negated by the need to re-rank with a larger model or by missed retrievals that require human review.
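The check recommended above can be sketched roughly as follows. This is a minimal illustration and not the reporter's actual setup: the helper names (`truncate_and_normalize`, `top_k`, `recall_at_k`, `compare_dims`) are hypothetical, the vectors are tiny hand-made stand-ins for real embeddings, and each query is assumed to have exactly one relevant document. It truncates vectors client-side and rescales them to unit length, then compares recall@k across dimension counts. The embeddings endpoint also accepts a `dimensions` parameter that shortens vectors server-side, which may be preferable in practice.

```python
import math


def truncate_and_normalize(vec, dims):
    """Keep the first `dims` components and rescale to unit length."""
    head = list(vec[:dims])
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return head  # nothing left to scale; leave the zero vector as-is
    return [x / norm for x in head]


def top_k(query, corpus, k):
    """Indices of the k corpus vectors with the highest dot product."""
    scores = [sum(q * d for q, d in zip(query, doc)) for doc in corpus]
    ranked = sorted(range(len(corpus)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


def recall_at_k(queries, corpus, relevant, k=10):
    """Fraction of queries whose single relevant doc index is in the top k."""
    hits = sum(
        1 for q, rel in zip(queries, relevant) if rel in top_k(q, corpus, k)
    )
    return hits / len(queries)


def compare_dims(queries, corpus, relevant, dims_list, k=10):
    """Recall@k after truncating both queries and corpus to each size."""
    results = {}
    for dims in dims_list:
        q_t = [truncate_and_normalize(q, dims) for q in queries]
        c_t = [truncate_and_normalize(d, dims) for d in corpus]
        results[dims] = recall_at_k(q_t, c_t, relevant, k)
    return results


if __name__ == "__main__":
    # Toy data (assumption): the third component carries the only signal
    # that separates document 2 from the others.
    corpus = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    queries = [[0.9, 0.1, 0.0], [0.0, 0.1, 0.9]]
    relevant = [0, 2]  # index of the one relevant doc per query
    print(compare_dims(queries, corpus, relevant, dims_list=[3, 2], k=1))
    # prints {3: 1.0, 2: 0.5}
```

In the toy data, dropping the last component removes the only feature that identifies document 2, so the second query still gets an answer but the wrong one. To run the same comparison on a real corpus, replace the toy lists with stored full-size embeddings and a labeled set of query/document pairs, and use `k=10` to match the recall@10 recommendation.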
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T20:57:13.982218+00:00— report_created — created