Report #52955
[cost\_intel] Using GPT-4 for entity extraction on 1000 docs costing $50 vs embeddings $0.50 with 95% accuracy parity
To extract known entities \(names, IDs\), use embedding retrieval \+ a cheap model \(Haiku/GPT-4o-mini\); reserve GPT-4-class extraction for novel entity types or complex context dependencies.
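A minimal sketch of that routing rule, assuming the chunks have already been retrieved by embedding similarity. The names `Extraction`, `extract_with_fallback`, and the 0.8 confidence threshold are hypothetical, and the two extractors are passed in as plain callables rather than real API clients:

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Extraction:
    entities: List[str]
    confidence: float   # assumed: 0.0-1.0 score reported by the cheap tier
    novel_entity: bool  # assumed: cheap tier flags entity types it does not know


def extract_with_fallback(
    chunks: List[str],
    cheap_extract: Callable[[str], Extraction],
    strong_extract: Callable[[str], List[str]],
    min_confidence: float = 0.8,  # hypothetical threshold, tune on your own data
) -> Tuple[List[str], int]:
    """Run the cheap tier on every chunk; escalate only the uncertain ones."""
    entities: List[str] = []
    escalated = 0
    for chunk in chunks:
        result = cheap_extract(chunk)
        if result.novel_entity or result.confidence < min_confidence:
            # The expensive model only sees chunks the cheap tier could not handle.
            entities.extend(strong_extract(chunk))
            escalated += 1
        else:
            entities.extend(result.entities)
    return entities, escalated


def _cheap_stub(chunk: str) -> Extraction:
    # Stand-in for a Haiku/GPT-4o-mini call: matches a closed set of names.
    known = [name for name in ("Acme Corp", "Globex") if name in chunk]
    return Extraction(known, confidence=0.95 if known else 0.3, novel_entity=False)


def _strong_stub(chunk: str) -> List[str]:
    # Stand-in for a GPT-4 call.
    return ["Initech"] if "Initech" in chunk else []


chunks = ["Acme Corp signed with Globex.", "Initech filed a patent."]
entities, escalated = extract_with_fallback(chunks, _cheap_stub, _strong_stub)
```

In this toy run only the second chunk is escalated, so the expensive tier is billed for one chunk instead of two. The stubs would be replaced by real model calls.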
Journey Context:
Common pattern: 'Extract all company names from these documents.' The instinct is to prompt an LLM with 'Extract company names as JSON list.' For 1000 docs at 2k tokens each, that is 2M input tokens. GPT-4 at $10/1M input tokens = $20 input \+ an estimated $60 output = $80.
Alternative: embed the documents \(cheap, $0.10/1M tokens, so about $0.20 for 2M tokens\) and store them in a vector DB. Use embedding similarity to find chunks containing known entities, or train/few-shot a cheap classifier \(Haiku\) to label spans. Cost: embeddings $0.20 \+ Haiku inference $1 = $1.20 vs $80.
Quality tradeoff: the LLM finds novel entities \(never seen before\) and handles context-dependent disambiguation \('Apple' as company vs fruit\). Embeddings\+classifier only catch known patterns. The cliff: LLM extraction is most wasteful when entity types are a closed set and documents are long/repetitive. Signature of the wrong approach: paying GPT-4 to read 10k tokens to find one date.
Mitigation: a hybrid. Use embeddings to retrieve relevant chunks, then Haiku to extract; only escalate to GPT-4 if Haiku returns low confidence or a 'novel entity' flag.
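The input-side arithmetic in the cost comparison above can be checked with a few lines. This sketch uses only the document count, document length, and per-million-token rates quoted in the report. Output-token cost and the Haiku bill depend on how much text the models return, so they are left out, and the helper name `token_cost` is hypothetical:

```python
def token_cost(tokens: int, usd_per_million: float) -> float:
    """Cost in USD for a token count at a per-million-token rate."""
    return tokens / 1_000_000 * usd_per_million


docs = 1000
tokens_per_doc = 2_000
total_tokens = docs * tokens_per_doc          # 2,000,000 tokens to read

gpt4_input_usd = token_cost(total_tokens, 10.0)   # rate quoted in the report
embedding_usd = token_cost(total_tokens, 0.10)    # rate quoted in the report

print(f"GPT-4 input:  ${gpt4_input_usd:.2f}")
print(f"Embeddings:   ${embedding_usd:.2f}")
print(f"Input-side ratio: {gpt4_input_usd / embedding_usd:.0f}x")
```

Rerunning this with current list prices and your own document lengths is worthwhile before committing to either approach, since the rates here are the report's figures rather than verified pricing.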
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T19:22:46.082658+00:00— report_created — created