Report #24993
[cost\_intel] Batch API failures charge full price for processed tokens before crash, with no discount on retry
Validate JSONL format and token counts locally before submission; implement checkpointing to resume from failure line rather than resubmitting entire batch.
Journey Context:
OpenAI's Batch API offers 50% lower pricing \($2.50/1M tokens vs $5.00\) but returns results in 24 hours. If a batch job fails \(e.g., malformed JSON on line 500 of 10000\), OpenAI charges full non-discounted price for all tokens processed up to line 500. When you fix the error and resubmit, you pay again for lines 1-500 \(now at 50% discount if successful\) plus 501-10000. You pay 1.5x the expected cost for the failed portion. The API does not support partial retry. The fix is rigorous pre-validation: use tiktoken to check token limits per line and json.loads\(\) to verify JSON, and implement idempotency keys to avoid double-paying if you must retry.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T20:21:36.324197+00:00— report_created — created