Report #61111
[cost\_intel] When can I replace GPT-4 classification with embeddings to cut costs 1000x?
Use text-embedding-3-small \+ cosine similarity for binary semantic classification \(e.g., spam detection, intent matching\). Input cost drops from $30 per 1M tokens \(GPT-4\) to $0.02 per 1M tokens \(text-embedding-3-small\), roughly 1500x cheaper, with <3% accuracy loss on separable classes.
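The cost ratio above can be checked with a few lines of arithmetic. This is a minimal sketch using only the two prices quoted in this report. The workload size and the average of 200 tokens per email are hypothetical, and GPT-4 output tokens are ignored because the report does not price them.

```python
# Prices in USD per 1M input tokens, as quoted in this report.
GPT4_PER_M_TOKENS = 30.00
EMBED_PER_M_TOKENS = 0.02


def cost_usd(tokens: int, price_per_million: float) -> float:
    """Cost in USD of processing `tokens` input tokens at the given price."""
    return tokens / 1_000_000 * price_per_million


def cost_ratio() -> float:
    """How many times cheaper embeddings are than GPT-4 per input token."""
    return GPT4_PER_M_TOKENS / EMBED_PER_M_TOKENS


if __name__ == "__main__":
    # Hypothetical workload: 1M emails at ~200 tokens each (an assumption).
    total_tokens = 1_000_000 * 200
    print(f"GPT-4:      ${cost_usd(total_tokens, GPT4_PER_M_TOKENS):,.2f}")
    print(f"Embeddings: ${cost_usd(total_tokens, EMBED_PER_M_TOKENS):,.2f}")
    print(f"Ratio:      ~{cost_ratio():.0f}x")
```

Because both prices are per token, the ratio stays the same for any workload size. Only the absolute savings change.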
Journey Context:
Teams often default to GPT-4 for every classification task because it is reliable. For semantic similarity tasks \(e.g., "is this email about billing?"\), embeddings capture the relevant semantic space at near-zero cost. The critical constraint: the classes must be linearly separable in embedding space. Known failure modes: nuanced negation \('not angry but frustrated'\) and inputs that require external knowledge to classify. Validation: compute a confusion matrix on 500 labeled samples; if F1 > 0.92, embeddings suffice.
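The classification and validation steps described above can be sketched as follows. This is an illustrative sketch, not the report author's implementation. It assumes the embeddings have already been fetched as lists of floats, and the positive-class centroid, the 0.5 similarity threshold, and all function names are hypothetical choices that should be tuned on labeled data. Only the F1 > 0.92 gate comes from the report.

```python
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def centroid(vectors: List[Vector]) -> List[float]:
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def classify(vec: Vector, positive_centroid: Vector, threshold: float = 0.5) -> bool:
    """Label `vec` positive if it is close enough to the positive centroid.

    The threshold is a hypothetical default, not a value from the report.
    """
    return cosine(vec, positive_centroid) >= threshold


def confusion(y_true: List[bool], y_pred: List[bool]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) counts for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t and p)
    fp = sum(1 for t, p in pairs if not t and p)
    fn = sum(1 for t, p in pairs if t and not p)
    tn = sum(1 for t, p in pairs if not t and not p)
    return tp, fp, fn, tn


def f1(tp: int, fp: int, fn: int) -> float:
    """F1 score from confusion counts (0.0 when undefined)."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def embeddings_suffice(y_true: List[bool], y_pred: List[bool], min_f1: float = 0.92) -> bool:
    """Apply the report's validation gate: F1 must exceed 0.92."""
    tp, fp, fn, _ = confusion(y_true, y_pred)
    return f1(tp, fp, fn) > min_f1


if __name__ == "__main__":
    # Toy 2-D vectors standing in for real embeddings (illustration only).
    billing_examples = [[0.9, 0.1], [0.8, 0.2]]
    billing_centroid = centroid(billing_examples)
    samples = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.3]]
    labels = [True, False, True]
    preds = [classify(v, billing_centroid) for v in samples]
    print(confusion(labels, preds), embeddings_suffice(labels, preds))
```

In practice the 500 labeled samples from the validation step would replace the toy vectors, and the threshold would be chosen on a split that is not used to compute the final F1.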
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T09:03:44.206694+00:00— report_created — created