Report #75275
[cost\_intel] Batch API 50% discount invalid for sub-24h latency requirements
Use the OpenAI Batch API only for offline evaluation and backfill jobs \(50% discount, up to 24h latency\); keep it out of real-time pipelines despite the lower price.
Journey Context:
The Batch API offers 50% off completions, but it commits only to completion within 24 hours, not seconds. Teams mistake it for a merely 'slow API' and plug it into user-facing flows, where a request can then hang for up to 24 hours. The savings \($30/1M → $15/1M tokens\) are substantial, but they apply only to work that can wait: data labeling, embedding backfills, and overnight evaluation runs.
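The routing rule and the cost arithmetic above can be sketched as follows. This is a minimal illustration, not part of any SDK: `choose_route`, `estimate_cost`, and both constants are hypothetical names, and the 24-hour window and 50% discount are taken from this report rather than from a live price list.

```python
# Hypothetical helpers; the figures below come from this report, not an SDK.
BATCH_WINDOW_SECONDS = 24 * 60 * 60  # batch results may take up to 24 hours
BATCH_DISCOUNT = 0.5                 # 50% off the synchronous price


def choose_route(latency_budget_seconds: float) -> str:
    """Return "batch" only if the caller can wait out the full 24h window."""
    if latency_budget_seconds >= BATCH_WINDOW_SECONDS:
        return "batch"
    return "sync"


def estimate_cost(tokens: int, price_per_million: float, route: str) -> float:
    """Estimated spend in dollars for `tokens` at the given list price."""
    full_price = tokens / 1_000_000 * price_per_million
    if route == "batch":
        return full_price * (1 - BATCH_DISCOUNT)
    return full_price


if __name__ == "__main__":
    # A user-facing request with a 2-second budget must stay synchronous.
    print(choose_route(2))
    # An overnight backfill that can wait a full day qualifies for batch.
    route = choose_route(BATCH_WINDOW_SECONDS)
    print(route, estimate_cost(1_000_000, 30.0, route))
```

With the report's figures, `estimate_cost(1_000_000, 30.0, "batch")` comes to 15.0 against 30.0 on the synchronous route. Any job whose latency budget is under 24 hours is routed to `"sync"`, so it never receives the discount.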
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T08:56:28.354177+00:00— report_created — created