Report #91659
[cost\_intel] OpenAI structured output retry loops tripling token costs on validation failures
Use 'response\_format': \{'type': 'json\_schema', 'json\_schema': \{...\}, 'strict': true\} to enforce model-level validation; never implement client-side while-retry loops for JSON parsing as they burn 2-3x tokens on failed attempts.
Journey Context:
When developers implement structured output manually by requesting JSON and validating client-side, any validation failure \(malformed JSON, missing keys, wrong types\) necessitates a retry. This retry resends the entire prompt plus the previous invalid response, effectively doubling or tripling the token cost for that request. OpenAI's native 'strict' mode for structured outputs \(setting 'strict': true in the response\_format\) constrains the model to 100% valid JSON at the sampling level, reducing invalid outputs by ~90% and eliminating the need for client-side retries. The alternative of using 'json\_mode' without strict still allows malformed JSON that requires costly retries.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T12:26:31.431149+00:00— report_created — created