Report #91267
[cost\_intel] Structured Output Failed Validation Full Regeneration Cost
Implement client-side JSON schema validation before the API call to catch impossible constraints; use 'strict' mode with constrained grammars to reduce invalid or schema-violating output; hard-limit retries to 1 with exponential backoff.
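A minimal sketch of the pre-validation and retry-cap items, using only the Python standard library. The names `find_impossible_constraints` and `generate_with_retry_cap` are hypothetical, and the `generate` callable stands in for whatever model call is actually used. The lint covers only a few JSON Schema keywords and is not a full validator.

```python
import json
import time


def find_impossible_constraints(schema, path="$"):
    """Return descriptions of constraints that no JSON value can satisfy."""
    problems = []
    props = schema.get("properties", {})
    # A required key that is undeclared while extra keys are forbidden can never appear.
    if schema.get("additionalProperties") is False:
        for key in schema.get("required", []):
            if key not in props:
                problems.append(f"{path}: required key {key!r} is not declared in properties")
    # A lower bound above its upper bound admits no value.
    for low, high in (("minItems", "maxItems"), ("minLength", "maxLength"), ("minimum", "maximum")):
        if low in schema and high in schema and schema[low] > schema[high]:
            problems.append(f"{path}: {low} > {high}")
    if "enum" in schema and len(schema["enum"]) == 0:
        problems.append(f"{path}: enum is empty")
    # Recurse into nested objects and array items (the "objects in arrays" case).
    for name, sub in props.items():
        if isinstance(sub, dict):
            problems.extend(find_impossible_constraints(sub, f"{path}.{name}"))
    items = schema.get("items")
    if isinstance(items, dict):
        problems.extend(find_impossible_constraints(items, f"{path}[]"))
    return problems


def generate_with_retry_cap(generate, is_valid, max_retries=1, base_delay=1.0, sleep=time.sleep):
    """Call generate(attempt) at most 1 + max_retries times, backing off between attempts."""
    for attempt in range(max_retries + 1):
        raw = generate(attempt)  # hypothetical model call returning a JSON string
        try:
            parsed = json.loads(raw)
            if is_valid(parsed):
                return parsed
        except json.JSONDecodeError:
            pass  # invalid JSON counts as a failed attempt
        if attempt < max_retries:
            sleep(base_delay * 2 ** attempt)  # exponential backoff before the next attempt
    raise RuntimeError(f"no valid output after {max_retries + 1} attempts")


# Example schema with two impossible constraints inside an array of objects.
schema = {
    "type": "object",
    "additionalProperties": False,
    "required": ["items"],
    "properties": {
        "items": {
            "type": "array",
            "minItems": 3,
            "maxItems": 2,
            "items": {
                "type": "object",
                "additionalProperties": False,
                "required": ["sku", "qty"],
                "properties": {"sku": {"type": "string"}},
            },
        }
    },
}
problems = find_impossible_constraints(schema)
```

If `problems` is non-empty, the request can be rejected locally without spending any tokens. With the default `max_retries=1`, a request is billed for at most two attempts.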
Journey Context:
When using Structured Outputs \(JSON mode\), if the model generates invalid JSON or violates the schema, the standard retry pattern resends the entire conversation plus the failed attempt to the model. Every attempt is billed in full: the original input \+ the failed output, then the retry input \(a larger context, since it now carries the failed output\) \+ a new output. A single failure therefore more than doubles the cost, and a 10k token request that fails twice consumes 30k\+ tokens across its three attempts. Users assume a retry is free, like an HTTP retry, but every attempt generates billable tokens. The trap is deepest with complex nested schemas \(objects in arrays\), where validation fails frequently. The fix is strict schema design and client-side pre-validation to eliminate impossible combinations before calling the model.
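The arithmetic can be sketched as follows. The function name `total_billed_tokens` and the 8,000 input / 2,000 output split of the 10k request are assumptions for illustration. The model counts raw tokens only and ignores price differences between input and output tokens and any prompt caching.

```python
def total_billed_tokens(input_tokens, output_tokens, failed_attempts, feedback_tokens=0):
    """Total tokens billed when every retry resends the context plus the failed output."""
    total = 0
    context = input_tokens
    for _ in range(failed_attempts + 1):  # the failed attempts plus the final one
        total += context + output_tokens
        # The next attempt's input grows by the failed output and any error feedback.
        context += output_tokens + feedback_tokens
    return total


# Assumed split of a 10k-token request: 8,000 input + 2,000 output.
no_failure = total_billed_tokens(8000, 2000, failed_attempts=0)
one_failure = total_billed_tokens(8000, 2000, failed_attempts=1)
two_failures = total_billed_tokens(8000, 2000, failed_attempts=2)
```

Under these assumptions the totals are 10,000, 22,000, and 36,000 tokens, which is consistent with the 30k\+ figure for two failures.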
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T11:47:10.025316+00:00— report_created — created