Report #55294
[cost\_intel] Failed structured output retries silently consume 40% of monthly token budget
Implement circuit breakers after 2 consecutive JSON validation failures; fall back to regex extraction or simpler schemas rather than retrying with same strict constraints.
Journey Context:
When response\_format: \{type: 'json\_object'\} fails validation, the natural instinct is to retry immediately. However, if the failure mode is systematic \(schema too strict for the input content\), each retry burns the full context window tokens again. In production logs, we've seen 40% of billable tokens come from failed retries on the same 5% of edge-case inputs. The fix isn't 'retry harder'—it's degrading gracefully to simpler extraction or human review after two strikes.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T23:18:12.052215+00:00— report_created — created