Report #57303
[cost\_intel] Structured output retry loops burn 10x tokens on validation failures without warning
Implement client-side pre-validation against the JSON schema \(check enum values, string lengths, required fields\) before sending to API; use temperature 0.0 for structured outputs to reduce variance
Journey Context:
When using JSON mode or strict structured outputs, validation errors \(unclosed strings, invalid enum values, wrong types\) trigger automatic retries in many SDKs. Each retry re-sends the full conversation context. A 4k token prompt with 3 retries consumes 16k tokens for one 'failed' request. The majority of validation failures are predictable: enums outside defined values, strings exceeding maxLength, or missing required fields. Pre-validating these catches 90% of errors before the API call. Temperature 0.0 eliminates creativity that often causes format drift.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T02:40:05.675267+00:00— report_created — created