Report #65731
[cost_intel] OpenAI Embedding Model Dimension Migration: Re-indexing Cost
Use the `dimensions` parameter in text-embedding-3 models to truncate output to 1536 dimensions, matching the vector dimension of existing ada-002 indexes; only migrate to the full 3072 dimensions if retrieval quality metrics justify the re-indexing cost.
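A minimal sketch of the request, assuming the official `openai` Python SDK (v1 or later) and an `OPENAI_API_KEY` in the environment. The helper names `build_embedding_request` and `embed` are hypothetical, introduced here only for illustration. The parameter controls output size only; it says nothing about whether scores are comparable with vectors from another model.

```python
def build_embedding_request(texts, dimensions=1536):
    """Return keyword arguments for client.embeddings.create().

    Hypothetical helper: it only assembles the request, so it can be
    inspected or unit-tested without network access.
    """
    return {
        "model": "text-embedding-3-large",
        "input": list(texts),
        # Ask the v3 model for shortened vectors instead of its default 3072.
        "dimensions": dimensions,
    }


def embed(texts, dimensions=1536):
    """Embed texts and check that each vector has the expected length."""
    # Imported lazily so build_embedding_request works without the SDK installed.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    response = client.embeddings.create(**build_embedding_request(texts, dimensions))
    vectors = [item.embedding for item in response.data]

    # Fail early rather than let the vector store reject the upsert later.
    for vector in vectors:
        if len(vector) != dimensions:
            raise ValueError(f"expected {dimensions} dimensions, got {len(vector)}")
    return vectors
```

Calling `embed(["some document text"])` would return one 1536-element vector per input under these assumptions; check the length against your index configuration before upserting.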
Journey Context:
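Teams upgrade from ada-002 to text-embedding-3-large for better retrieval quality without realizing that the new model outputs 3072 dimensions by default, versus 1536 for ada-002. Existing Pinecone/Weaviate indexes reject vectors with mismatched dimensions or require a new index or namespace. Re-indexing a 10M-document corpus takes up to 10M API calls (fewer if inputs are batched). At $0.13 per 1M tokens and roughly 1,000 tokens per document (the average that the total implies), that is 10B tokens, or about $1,300 in embedding costs alone, plus storage duplication and engineering time. The trap is treating embedding models as drop-in replacements; they are schema migrations. The solution is to pass `dimensions: 1536` to the v3 model so its output fits the existing index schema, deferring a full 3072-dimension re-index until the latency/quality tradeoff is justified. The parameter matches dimensionality only: v3 and ada-002 vectors come from different embedding spaces, so similarity scores across the two models are not meaningful.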
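The cost figure above can be reproduced with a short calculation. This is a sketch under stated assumptions: the function name `embedding_cost_usd` is hypothetical, and the 1,000 tokens-per-document average is inferred from the $1,300 total rather than measured.

```python
def embedding_cost_usd(num_docs, avg_tokens_per_doc, price_per_million_tokens):
    """Estimate the embedding spend for re-indexing a corpus."""
    total_tokens = num_docs * avg_tokens_per_doc
    return total_tokens / 1_000_000 * price_per_million_tokens


# 10M documents, assumed 1,000 tokens each, at $0.13 per 1M tokens.
cost = embedding_cost_usd(
    num_docs=10_000_000,
    avg_tokens_per_doc=1_000,  # assumption; measure this on your own corpus
    price_per_million_tokens=0.13,
)
print(f"${cost:,.2f}")  # prints $1,300.00
```

The estimate scales linearly with document length, so halving the assumed average to 500 tokens halves the figure to about $650. It covers API spend only, not storage duplication or engineering time.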
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T16:48:28.821966+00:00— report_created — created