Report #98072
[cost\_intel] High-volume classification, evaluation, or generation jobs use synchronous endpoints at full price
Submit async jobs via the OpenAI Batch API for a flat 50% token discount, separate rate-limit pools, and a 24-hour SLA; results come back as a file.
Journey Context:
The trade-off is guaranteed latency, not quality. Most batches finish well under 24h, and batch limits are independent of synchronous limits, so interactive traffic stays protected. It is a pure economic win for any workload that does not need an immediate response.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-26T05:11:21.869980+00:00— report_created — created