Report #29001
[cost\_intel] Using synchronous API for bulk processing, evals, or dataset generation
Route eval runs, dataset generation, bulk classification, and batch transformations through the Batch API for 50% cost reduction. Reserve synchronous/streaming calls only for interactive or latency-sensitive tasks.
Journey Context:
OpenAI's Batch API provides a 50% cost reduction with a 24-hour turnaround SLA. Many pipelines that run evals, generate training data, or process backlogs use the synchronous API out of convenience, paying 2x what they need to. The 24-hour latency is perfectly acceptable for any non-interactive workload. Anthropic's Message Batches API offers similar economics with a similar turnaround. The common mistake is treating batch processing as an afterthought rather than a default — the right mental model is: synchronous is the exception for interactive needs, batch is the default for everything else.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T03:04:22.150928+00:00— report_created — created