Report #76733
[cost_intel] When is the OpenAI Batch API the wrong choice for cost reduction?
Use the Batch API for >1000 requests/day with 24-hour latency tolerance; avoid it for workflows that require error handling within minutes or partial result processing, or when input size varies wildly (batches fail atomically if one row has malformed JSON).
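The rule of thumb above can be written down as a small check. This is only a sketch of the report's heuristic: the function name, parameter names, and the choice to treat exactly 1000 requests/day as qualifying are assumptions made for illustration, not part of any OpenAI API.

```python
def batch_api_fits(requests_per_day, latency_tolerance_hours,
                   needs_fast_error_handling=False,
                   needs_partial_results=False,
                   min_requests_per_day=1000):
    """Return True when the report's heuristic favors the Batch API.

    Hypothetical helper: thresholds come from the report, not from OpenAI.
    """
    # Workflows that must react to errors within minutes, or consume
    # results as they arrive, do not fit a wait-up-to-24h model.
    if needs_fast_error_handling or needs_partial_results:
        return False
    # The caller must be able to wait a full day for results.
    if latency_tolerance_hours < 24:
        return False
    # Below the volume threshold, the standard API with retries is simpler.
    return requests_per_day >= min_requests_per_day
```

For example, `batch_api_fits(5000, 24)` favors batching, while `batch_api_fits(100, 24)` and `batch_api_fits(5000, 2)` do not.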
Journey Context:
The Batch API offers a 50% discount, but results can take up to 24 hours. Critical trap: the batch fails entirely if a single request contains malformed JSON or exceeds token limits, which costs a full day of latency and yields zero partial results. There is no streaming and no real-time error correction. Best for: nightly embedding generation, bulk classification of support tickets, and historical data backfill. Break-even: at 1k requests/day, the 50% savings outweigh the 1-day latency; at 100 requests/day, use the standard API with retries. Always validate the JSON of every request before submission; the atomic failure mode is unforgiving.
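A pre-submission check along these lines can catch malformed rows before they cost a day. This is a minimal sketch, not an official validator: it assumes each input line is a JSON object with `custom_id`, `method`, `url`, and `body` keys, as in the Batch API's JSONL input format. The function name `validate_batch_lines` and the `max_chars` size cap are hypothetical, and the cap is only a crude stand-in for a real token count.

```python
import json

# Keys each batch request line is assumed to carry.
REQUIRED_KEYS = {"custom_id", "method", "url", "body"}


def validate_batch_lines(lines, max_chars=None):
    """Split JSONL lines into (valid_lines, errors).

    errors is a list of (line_number, reason) tuples, 1-indexed.
    max_chars is a hypothetical size cap, not a real token limit.
    """
    valid, errors, seen_ids = [], [], set()
    for number, raw in enumerate(lines, start=1):
        text = raw.strip()
        if not text:
            continue  # drop blank lines instead of submitting them
        try:
            record = json.loads(text)
        except json.JSONDecodeError as exc:
            errors.append((number, f"malformed JSON: {exc.msg}"))
            continue
        if not isinstance(record, dict):
            errors.append((number, "line is not a JSON object"))
            continue
        missing = REQUIRED_KEYS - record.keys()
        if missing:
            errors.append((number, f"missing keys: {sorted(missing)}"))
            continue
        custom_id = record["custom_id"]
        if not isinstance(custom_id, str):
            errors.append((number, "custom_id is not a string"))
            continue
        if custom_id in seen_ids:
            errors.append((number, f"duplicate custom_id: {custom_id}"))
            continue
        if max_chars is not None and len(text) > max_chars:
            errors.append((number, f"line exceeds {max_chars} characters"))
            continue
        seen_ids.add(custom_id)
        valid.append(text)
    return valid, errors
```

Submitting only `valid` and routing `errors` to the standard API or a fix-up queue keeps one bad row from blocking the rest.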
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T11:23:04.869475+00:00— report_created — created