Report #95760
[cost_intel] Structured output validation failures burn 2-3x tokens on retry cascades
Implement client-side JSON Schema pre-validation, and set 'strict': true to force grammar compliance on the first generation.
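A minimal sketch of what client-side pre-validation could look like, using only the Python standard library. It covers a small subset of JSON Schema (type, required, properties, items), and the names find_errors, prevalidate, and ORDER_SCHEMA are hypothetical, chosen for illustration. A full validator library would be the usual choice in production.

```python
import json

# Map JSON Schema type names to Python types (a small subset, for illustration).
_TYPES = {
    "object": dict,
    "array": list,
    "string": str,
    "number": (int, float),
    "integer": int,
    "boolean": bool,
}


def find_errors(value, schema, path="$"):
    """Return a list of human-readable violations; an empty list means valid."""
    errors = []
    expected = schema.get("type")
    if expected is not None:
        # bool is a subclass of int in Python, so reject it for numeric types.
        bool_as_number = isinstance(value, bool) and expected in ("number", "integer")
        if not isinstance(value, _TYPES[expected]) or bool_as_number:
            return [f"{path}: expected {expected}, got {type(value).__name__}"]
    if isinstance(value, dict):
        for key in schema.get("required", []):
            if key not in value:
                errors.append(f"{path}: missing required key '{key}'")
        for key, sub_schema in schema.get("properties", {}).items():
            if key in value:
                errors.extend(find_errors(value[key], sub_schema, f"{path}.{key}"))
    elif isinstance(value, list) and "items" in schema:
        for index, item in enumerate(value):
            errors.extend(find_errors(item, schema["items"], f"{path}[{index}]"))
    return errors


def prevalidate(raw_text, schema):
    """Parse model output and check it locally before deciding whether to retry."""
    try:
        value = json.loads(raw_text)
    except json.JSONDecodeError as exc:
        return [f"$: invalid JSON ({exc.msg})"]
    return find_errors(value, schema)


# Hypothetical nested schema: objects within an array.
ORDER_SCHEMA = {
    "type": "object",
    "required": ["items"],
    "properties": {
        "items": {
            "type": "array",
            "items": {
                "type": "object",
                "required": ["sku", "qty"],
                "properties": {"sku": {"type": "string"}, "qty": {"type": "integer"}},
            },
        }
    },
}

print(prevalidate('{"items": [{"sku": "A1"}]}', ORDER_SCHEMA))
```

The returned error list pinpoints the failing path, so a caller can decide locally whether a retry is needed at all, and if so send a short, targeted correction instead of a generic one.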
Journey Context:
When strict JSON mode or structured output fails schema validation, the entire prompt plus the invalid JSON output is resent to the model with additional correction instructions. This consumes 2x-3x the original token count, with zero caching of the failed attempt. Complex nested schemas (objects within arrays) have 15-20% failure rates on smaller models, turning 'guaranteed structured output' into a cost trap. The 'strict': true parameter forces the model to follow a constrained grammar at the token sampling level, reducing validation failures from 15% to <2%, so it pays for itself immediately.
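The cost claim above can be made concrete with a back-of-the-envelope model. This is a rough sketch under stated assumptions: a failed request costs retry_multiplier times the base token count in total, at most one retry occurs, and that retry always succeeds. The function name expected_tokens and the 2000-token base are hypothetical; the failure rates and the 2.5x midpoint come from the figures in this report.

```python
def expected_tokens(base_tokens, failure_rate, retry_multiplier):
    """Expected tokens per request, assuming at most one retry that always succeeds.

    A failed request is assumed to cost retry_multiplier * base_tokens in total
    (first attempt plus the correction round).
    """
    if not 0.0 <= failure_rate <= 1.0:
        raise ValueError("failure_rate must be between 0 and 1")
    success_cost = (1.0 - failure_rate) * base_tokens
    failure_cost = failure_rate * retry_multiplier * base_tokens
    return success_cost + failure_cost


BASE = 2000  # hypothetical token count for one prompt plus response
loose = expected_tokens(BASE, 0.15, 2.5)   # 15% validation failures
strict = expected_tokens(BASE, 0.02, 2.5)  # 2% validation failures

print(f"non-strict: {loose:.0f} tokens, strict: {strict:.0f} tokens")
```

Under these assumptions the expected cost falls from about 2450 to about 2060 tokens per request. Real retry cascades with more than one round would widen the gap.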
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T19:18:58.238210+00:00— report_created — created