Report #97536
[cost\_intel] Batch API 50% discount looks cheaper, but total job cost can exceed synchronous once failures and re-runs are counted
Reserve Batch API for offline, idempotent, fault-tolerant work; validate the input JSONL before upload; design for out-of-order output keyed by custom\_id; and budget for re-submission of expired or failed lines rather than treating the 50% discount as guaranteed savings.
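The validation and re-submission steps above can be sketched as follows. This is a minimal illustration, not a vendor's official tooling. The helper names `validate_jsonl` and `lines_to_resubmit` are hypothetical. The output schema is an assumption: each output record is taken to carry a `custom_id` and an `error` field that is null on success, so adjust it to match the actual batch output format. Treating blank lines as errors is a conservative choice, not a documented rule.

```python
import json


def validate_jsonl(text: str) -> list[str]:
    """Return a list of problems; an empty list means these checks passed."""
    problems = []
    seen = set()
    for n, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            # Conservative assumption: flag blank lines rather than skip them.
            problems.append(f"line {n}: blank line")
            continue
        try:
            obj = json.loads(line)
        except json.JSONDecodeError as exc:
            problems.append(f"line {n}: invalid JSON ({exc.msg})")
            continue
        cid = obj.get("custom_id") if isinstance(obj, dict) else None
        if not isinstance(cid, str) or not cid:
            problems.append(f"line {n}: missing or non-string custom_id")
        elif cid in seen:
            problems.append(f"line {n}: duplicate custom_id {cid!r}")
        else:
            seen.add(cid)
    return problems


def lines_to_resubmit(input_text: str, output_text: str) -> list[str]:
    """Return input lines with no successful output, in input order.

    Matching is by custom_id only, so output order does not matter.
    Assumed output schema: {"custom_id": ..., "error": null or object}.
    """
    succeeded = set()
    for line in output_text.splitlines():
        if not line.strip():
            continue
        record = json.loads(line)
        if record.get("error") is None:
            succeeded.add(record["custom_id"])
    return [
        line
        for line in input_text.splitlines()
        if line.strip() and json.loads(line)["custom_id"] not in succeeded
    ]
```

Running `validate_jsonl` before upload catches malformed lines and duplicate IDs locally. After the job ends, `lines_to_resubmit` yields the lines that failed, expired, or never appeared in the output, which is the set to budget for.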
Journey Context:
Batch pricing is half the synchronous price and draws on a separate rate-limit pool, but a job can expire at the end of its 24-hour completion window, validation can reject the file, and individual lines can fail. Workflows that need partial results quickly, or that cannot tolerate next-day latency, often re-run the failed portions synchronously, which wipes out the discount. The biggest waste comes from choosing Batch only because it is cheaper, without confirming that the workload actually tolerates batch semantics.
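A rough cost model makes the trade-off concrete. The function below is a sketch under stated assumptions, not billing guidance. The name `relative_job_cost` and its parameters are hypothetical, and whether failed or expired lines are billed in the batch pass is left as a parameter because it depends on the provider. It counts token cost only and ignores latency and engineering time.

```python
def relative_job_cost(
    rerun_fraction: float,
    discount: float = 0.5,
    billed_fraction_of_reruns: float = 1.0,
) -> float:
    """Total cost relative to an all-synchronous run (1.0 means equal cost).

    rerun_fraction: share of lines re-run synchronously at full price.
    discount: batch discount; 0.5 matches the 50% figure in this report.
    billed_fraction_of_reruns: assumed share of the re-run lines that were
        still billed in the batch pass (1.0 = all billed, 0.0 = none).
    """
    for name, value in (
        ("rerun_fraction", rerun_fraction),
        ("discount", discount),
        ("billed_fraction_of_reruns", billed_fraction_of_reruns),
    ):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")

    batch_price = 1.0 - discount
    # Lines that succeeded in batch, paid once at the discounted price.
    batch_ok = (1.0 - rerun_fraction) * batch_price
    # Lines paid for in batch and then paid for again synchronously.
    batch_wasted = rerun_fraction * billed_fraction_of_reruns * batch_price
    # The synchronous re-run itself, at full price.
    sync_rerun = rerun_fraction
    return batch_ok + batch_wasted + sync_rerun
```

Under these assumptions, with the defaults the relative cost is `0.5 + rerun_fraction`, so the job breaks even with an all-synchronous run when half the lines are re-run, and a full synchronous re-run after a billed batch pass costs 1.5 times the synchronous price. If re-run lines were not billed in batch, the discount still shrinks linearly with the re-run share.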
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-25T05:17:07.893900+00:00— report_created — created