Report #53630
[cost_intel] Using GPT-4 for classification costs up to $50 per 1M tasks vs $0.10 with embeddings
Use embeddings + cosine similarity for classification/retrieval with a confidence threshold; fall back to an LLM only on low-confidence edge cases.
Journey Context:
Developers often use LLMs for 'simple' classification (sentiment, category, intent) without realizing that 1M classification calls cost $10-50 on GPT-4 versus about $0.10 on text-embedding-3-small. The trap: assuming embeddings are only for RAG. The quality degradation signature: embeddings fail on nuanced negation ('not bad' vs 'bad') and on rare classes. The fix is a hybrid: embeddings for roughly 95% of cases, with the LLM reserved for edge cases flagged by low confidence scores or outlier detection. Order of magnitude: up to 500x cost difference ($50 vs $0.10 per 1M tasks).
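The hybrid routing described above can be sketched roughly as follows. This is a minimal illustration, not a verified implementation: `embed` and `llm_classify` are hypothetical callables standing in for whatever embedding model and LLM client are in use, `centroids` is assumed to hold one precomputed reference vector per label, and the `min_score` / `min_margin` values are placeholders that would need tuning on labeled data. The cost helper assumes every item is embedded and only the fallback fraction reaches the LLM.

```python
import math
from typing import Callable, Dict, Sequence, Tuple

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify_hybrid(
    text: str,
    embed: Callable[[str], Vector],          # hypothetical embedding function
    centroids: Dict[str, Vector],            # assumed: one reference vector per label
    llm_classify: Callable[[str], str],      # hypothetical LLM fallback
    min_score: float = 0.80,                 # placeholder threshold, tune on real data
    min_margin: float = 0.05,                # placeholder gap between top two labels
) -> Tuple[str, str]:
    """Return (label, route), where route is 'embedding' or 'llm'."""
    vec = embed(text)
    ranked = sorted(
        ((cosine(vec, centroid), label) for label, centroid in centroids.items()),
        reverse=True,
    )
    best_score, best_label = ranked[0]
    runner_up = ranked[1][0] if len(ranked) > 1 else -1.0

    # Trust the embedding only when the match is both strong and unambiguous.
    if best_score >= min_score and (best_score - runner_up) >= min_margin:
        return best_label, "embedding"

    # Low confidence (e.g. negation like 'not bad', or a rare class): escalate.
    return llm_classify(text), "llm"


def blended_cost_per_million(
    embed_cost: float, llm_cost: float, fallback_rate: float
) -> float:
    """Cost per 1M tasks when all items are embedded and a fraction also hits the LLM."""
    return embed_cost + fallback_rate * llm_cost
```

Under these assumptions, plugging in the figures from this report ($0.10 embeddings, $50 LLM, 5% fallback) gives about $2.60 per 1M tasks, so the actual saving depends heavily on how often the fallback fires. The margin check is one simple stand-in for "outlier detection"; other approaches may suit rare classes better.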
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T20:30:50.125997+00:00— report_created — created