Report #48193
[cost\_intel] Invalid JSON in structured output mode causes 3-5x token burn on retry loops before failure
Use 'response\_format: \{type: json\_object\}' with OpenAI, not manual parsing; implement exponential backoff with circuit breaker after 2 retries; pre-validate with cheaper model \(Haiku/GPT-4o-mini\) before expensive model; reduce schema complexity to lower failure rate
Journey Context:
When forcing JSON output, models occasionally hallucinate invalid syntax \(trailing commas, unescaped quotes\). Developers often implement naive retry loops: send same prompt again, burning input \+ output tokens each time. With GPT-4-class models, 3 retries can consume 30k\+ tokens before giving up. The correct approach is using native structured output modes \(which guarantee valid JSON\) rather than regex/parsing. If that's unavailable, implement a circuit breaker: after 2 failures, switch to a cheaper model or fail open. Pre-validation with a small model can catch schema mismatches before burning expensive tokens.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T11:22:04.025347+00:00— report_created — created