Report #75944
[cost\_intel] Failed structured output retries multiply token cost in validation loops
Trip a circuit breaker after 2 retries; prefer constrained generation \(JSON mode, strict schema enforcement\) over parse-and-retry; validate input data client-side before sending.
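A minimal sketch of the retry cap and the client-side pre-check, for illustration only. `extract_json`, `call_model`, `required_keys`, and `CircuitOpenError` are hypothetical names, not part of any SDK; `call_model` stands in for whatever function sends the prompt and returns the raw model text. "After 2 retries" is read here as one initial call plus at most two retries.

```python
import json
from typing import Callable, Optional

MAX_RETRIES = 2  # assumed reading: 1 initial call + 2 retries = 3 calls at most


class CircuitOpenError(RuntimeError):
    """Raised when the model keeps returning output that fails validation."""


def extract_json(
    text: str,
    call_model: Callable[[str], str],  # hypothetical: prompt in, raw output out
    required_keys: frozenset = frozenset(),
    max_retries: int = MAX_RETRIES,
) -> dict:
    # Client-side pre-validation: do not pay for a request that cannot succeed.
    if not text or not text.strip():
        raise ValueError("empty input; no request sent")

    attempts = 1 + max_retries
    last_error: Optional[Exception] = None
    for _ in range(attempts):
        raw = call_model(text)
        try:
            data = json.loads(raw)
        except json.JSONDecodeError as exc:
            last_error = exc
            continue  # syntax failure: counts against the retry budget
        if isinstance(data, dict) and required_keys.issubset(data):
            return data
        last_error = ValueError("output missing required keys")

    # Circuit breaker: stop spending tokens once the budget is exhausted.
    raise CircuitOpenError(f"gave up after {attempts} attempts: {last_error}")
```

With constrained generation enabled on the model side, the `json.loads` branch should rarely fail, and the cap mainly guards against repeated schema-level violations.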
Journey Context:
In 'generate then validate' patterns for JSON extraction, every malformed output is billed in full \(input \+ output tokens\) before it fails regex or JSON.parse validation. Retrying 3-4 times multiplies the cost by roughly the number of attempts and, when every attempt fails, returns nothing of value. This is common with 'extract JSON from markdown code blocks' approaches. OpenAI's JSON mode constrains the output to syntactically valid JSON, and Structured Outputs additionally constrains decoding to a supplied JSON Schema, so both remove syntax failures, but only Structured Outputs also enforces the schema. Neither prevents truncated output when the token limit is reached, so a completion check is still needed. Circuit breakers bound the spend when the model consistently violates schema constraints. Client-side pre-validation ensures you don't pay for requests that cannot succeed \(e.g., extraction from an empty string\).
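The cost multiplication can be made concrete with a small calculation. This is a sketch under stated assumptions: `retry_token_cost` is a hypothetical helper, the token counts are invented for illustration, and every retry is assumed to resend the same prompt and produce output of similar length.

```python
def retry_token_cost(input_tokens: int, output_tokens: int, failed_attempts: int) -> dict:
    """Estimate tokens billed when `failed_attempts` calls fail before one succeeds."""
    per_call = input_tokens + output_tokens
    if per_call <= 0 or failed_attempts < 0:
        raise ValueError("token counts must be positive and attempts non-negative")
    total = per_call * (failed_attempts + 1)  # failed calls plus the final one
    wasted = per_call * failed_attempts       # tokens billed for discarded outputs
    return {
        "per_call": per_call,
        "total": total,
        "wasted": wasted,
        "multiplier": total / per_call,
    }


# Illustrative numbers only: 3000 input tokens, 500 output tokens, 3 failed attempts.
estimate = retry_token_cost(3000, 500, 3)
print(estimate)
# {'per_call': 3500, 'total': 14000, 'wasted': 10500, 'multiplier': 4.0}
```

If each retry also appends the failed output and an error message to the prompt, the input grows on every attempt and the real bill rises faster than this linear estimate.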
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T10:03:47.980847+00:00— report_created — created