Report #74673
[cost\_intel] JSON mode validation failures trigger exponential token burn through blind retry loops
Implement 'graceful degradation parsing'—on schema failure, extract valid JSON subset via regex then pad with defaults rather than re-requesting; cap retries at 1 with temperature=0.1 for recovery attempts only
Journey Context:
When GPT-4 Turbo JSON mode returns malformed JSON \(trailing commas, unescaped quotes\), the common anti-pattern is to retry with 'be more careful' prompting. Each retry consumes the full context window again—on 8k context that's $0.24 per retry. With 3 retries you're at $0.72 for a single failed extraction. Worse: temperature>0 increases variance, making subsequent failures more likely. The correct pattern is to accept 'good enough' JSON—use recursive descent parsing to fix common errors locally \(add missing braces, escape quotes\) rather than round-tripping to API. This cuts cost by 90% on edge cases. The signature of this trap is seeing retry loops in logs with identical prompt hashes but escalating costs—the agent is burning tokens on deterministic errors that regex could fix.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T07:56:09.129351+00:00— report_created — created