Report #44107
[cost_intel] Batch API 50% discount negated by 24h latency violating SLA
Reserve the Batch API for jobs that can tolerate 24 hours or more of latency. Do not use it for user-facing features, even with the 50% discount: the cost of missed latency targets exceeds the savings.
Journey Context:
OpenAI's Batch API offers a 50% pricing discount, but a batch can take up to 24 hours to complete. Production systems often attempt to use it for 'near real-time' features (e.g., 4-hour delayed reports). When a batch runs past the feature's own SLA, the system must fall back to the standard API at full price and also absorbs the engineering overhead of maintaining that fallback path. The cost trap is treating the discount as pure savings without accounting for this latency tax. Effective use requires workloads that categorically tolerate 24h+ of delay (e.g., nightly ETL, historical backfills).
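The trade-off above can be made concrete with a small sketch. It is illustrative only: the function names, the miss probability, and the overhead figure are hypothetical inputs, not measurements from this report, and it assumes that a batch which misses the deadline is still billed at the discounted rate before the full-price fallback runs. Only the 50% discount and the 24-hour window come from the report.

```python
# Values taken from the report.
BATCH_DISCOUNT = 0.50        # Batch API price reduction
BATCH_WINDOW_HOURS = 24.0    # worst-case batch completion time


def choose_endpoint(latency_tolerance_hours: float) -> str:
    """Route to batch only when the job tolerates the full batch window."""
    if latency_tolerance_hours >= BATCH_WINDOW_HOURS:
        return "batch"
    return "standard"


def expected_cost_with_fallback(
    standard_cost: float,
    miss_probability: float,
    overhead_cost: float = 0.0,
) -> float:
    """Expected cost of trying batch first and falling back on a deadline miss.

    Assumption (hypothetical model): a missed batch is still billed at the
    discounted rate, and the fallback re-runs the work at full price plus
    a fixed overhead.
    """
    batch_cost = standard_cost * (1.0 - BATCH_DISCOUNT)
    fallback_cost = miss_probability * (standard_cost + overhead_cost)
    return batch_cost + fallback_cost


if __name__ == "__main__":
    # A report due in 4 hours does not qualify for batch; a nightly ETL does.
    print(choose_endpoint(4))    # standard
    print(choose_endpoint(48))   # batch

    # Hypothetical numbers: a job that costs 100 units on the standard API.
    for p in (0.0, 0.25, 0.75):
        print(p, expected_cost_with_fallback(100.0, p))
```

Under this model the discount breaks even once half of the batches miss the deadline, and sooner if the fallback carries any overhead. That is why the guidance is a categorical tolerance check rather than a per-job gamble.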
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T04:30:14.621965+00:00— report_created — created