Report #67840
[cost_intel] Using an LLM for classification where embeddings suffice costs 500-1000x more
Use text-embedding-3-small + cosine similarity or logistic regression for binary/multiclass classification under 50 classes; reserve LLMs for hierarchical, multi-label, or reasoning-dependent classification.
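A minimal sketch of the cosine-similarity route, assuming the document vectors have already been produced by an embedding model such as text-embedding-3-small. The toy 3-dimensional vectors, the labels, and the helper names (`class_centroids`, `classify`) are hypothetical and exist only for illustration. Nearest-centroid is one simple way to apply cosine similarity, not necessarily the setup behind this report.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def class_centroids(labeled_vectors):
    """Average the embedding vectors of each label into one centroid per class."""
    sums, counts = {}, {}
    for label, vec in labeled_vectors:
        if label not in sums:
            sums[label] = [0.0] * len(vec)
            counts[label] = 0
        sums[label] = [s + v for s, v in zip(sums[label], vec)]
        counts[label] += 1
    return {
        label: [s / counts[label] for s in total]
        for label, total in sums.items()
    }


def classify(vector, centroids):
    """Return the label whose centroid is most similar to the vector."""
    return max(
        centroids,
        key=lambda label: cosine_similarity(vector, centroids[label]),
    )


# Toy stand-ins for real embeddings (assumption: real vectors would come
# from an embedding API call and have far more dimensions).
train = [
    ("positive", [0.9, 0.1, 0.0]),
    ("positive", [0.8, 0.2, 0.1]),
    ("negative", [0.1, 0.9, 0.0]),
    ("negative", [0.0, 0.8, 0.2]),
]
centroids = class_centroids(train)
print(classify([0.7, 0.3, 0.1], centroids))
```

With a few labeled examples per class, the only per-document cost at inference time is one embedding call plus a handful of dot products. A logistic regression trained on the same vectors is a drop-in alternative when classes are not well separated by centroids.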
Journey Context:
Classifying 1000 documents via GPT-4o: 1000 docs * 500 prompt tokens * $2.50/1M tokens = $1.25 for input, plus output costs. Via embeddings: 1000 docs * 100 tokens * $0.02/1M tokens = $0.002. The LLM approach costs 500-1000x more (about 625x on input alone in this example) and has higher latency. For binary sentiment or topic classification the quality gap is negligible (under 2% accuracy), so the extra spend buys no meaningful gain.
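The arithmetic above can be reproduced with a short script. This is a sketch using only the figures quoted in this report (token counts and per-million prices). The function name `input_cost_usd` is made up for illustration, and output-token costs are deliberately left out because the report does not quantify them.

```python
def input_cost_usd(num_docs, tokens_per_doc, price_per_million_tokens):
    """Input-token cost in USD for processing num_docs documents."""
    return num_docs * tokens_per_doc * price_per_million_tokens / 1_000_000


# Figures as quoted in the report; actual prices may differ.
llm_cost = input_cost_usd(1000, 500, 2.50)        # GPT-4o, 500 prompt tokens per doc
embedding_cost = input_cost_usd(1000, 100, 0.02)  # text-embedding-3-small, 100 tokens per doc
ratio = llm_cost / embedding_cost

print(f"LLM input cost:       ${llm_cost:.2f}")
print(f"Embedding input cost: ${embedding_cost:.3f}")
print(f"Ratio (input only):   {ratio:.0f}x")
```

The ratio comes from two factors: a 125x difference in per-token price and a 5x difference in tokens per document, since the LLM prompt carries instructions in addition to the document text.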
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T20:20:57.021304+00:00— report_created — created