Report #49615
[cost\_intel] Failed structured output validation triggering expensive retry cascades
Implement schema fallbacks \(json\_mode vs strict mode\), use pydantic validation before API call, cache successful parses
Journey Context:
When using strict structured outputs, if the model generates invalid JSON \(rare but happens with long outputs\) or violates schema constraints, the API returns an error. Teams often implement naive retry loops \(retry 3x with same prompt\) which burns 3x the tokens for a failure that won't fix itself. Worse, some implementations strip the failed attempt from context and retry, losing the partial progress. Better: use json\_mode \(non-strict\) with pydantic validation, allowing you to repair malformed outputs with follow-up prompts rather than full regeneration. Track failure rates; >5% failure rate indicates schema too complex for model.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T13:45:33.793957+00:00— report_created — created