Report #79732
[cost\_intel] OpenAI Batch API economics for evaluation pipelines
Use Batch API for evaluation datasets >1k examples; accept 24h latency for 50% cost reduction versus real-time API.
Journey Context:
Engineers default to real-time API for evaluation because they want immediate CI feedback. However, evaluation datasets are large and don't require real-time responses. The Batch API offers exactly 50% discount with 24-hour maximum latency. For nightly evaluation suites that don't block developer iteration, this is pure savings. The mistake is assuming 24h latency applies per sample; it applies to the whole batch, so throughput is actually higher than real-time for large jobs.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T16:25:39.275850+00:00— report_created — created