Report #30501
[cost_intel] Failed structured output retries append error context, causing linear token burn
Do not append failed attempts to the conversation history. Use 'previous_response_id' or resample with temperature, and keep the context window clean of failure traces.
Journey Context:
When JSON mode or function calling fails (invalid syntax or a schema violation), the naive retry pattern appends the failed output plus an error message to the context, then asks the model to try again. Each retry therefore grows the context by one failed output and one error description, all of it garbage tokens, and every later call re-sends that garbage as input. After 3 retries, the final request carries three failed attempts and three error descriptions on top of the original prompt. The correct pattern is to discard the failed attempt entirely: either make a fresh API call with the same context (no history of the failure), or use a provider-native retry mechanism that does not append to the context. Never let the model 'see' its own previous failed JSON attempt in the conversation history.
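A minimal Python sketch of the discard-and-resample pattern, plus the input-token arithmetic behind it. Nothing here comes from a specific provider SDK: `call_model` is a hypothetical callable standing in for whatever client call you use, the helper names are made up for illustration, and the token counts are illustrative rather than measured.

```python
import json
from typing import Callable

Message = dict[str, str]


def naive_retry_input_tokens(context: int, failed_output: int,
                             error_note: int, retries: int) -> int:
    """Input tokens sent over the first call plus `retries` retries when
    every failed output and its error note are appended to the context."""
    return sum(context + k * (failed_output + error_note)
               for k in range(retries + 1))


def clean_retry_input_tokens(context: int, retries: int) -> int:
    """Input tokens sent when every attempt reuses the same clean context."""
    return context * (retries + 1)


def generate_json(call_model: Callable[[list[Message]], str],
                  messages: list[Message],
                  max_attempts: int = 3) -> dict:
    """Resample from the same context until the output parses as a JSON object.

    `call_model` is a hypothetical stand-in for a provider call. Failed
    outputs are dropped and never appended to `messages`.
    """
    last_error: Exception | None = None
    for _ in range(max_attempts):
        # Pass a copy so the shared context cannot be mutated by the callee.
        raw = call_model(list(messages))
        try:
            parsed = json.loads(raw)
        except json.JSONDecodeError as exc:
            last_error = exc  # kept for diagnostics only, not sent to the model
            continue
        if isinstance(parsed, dict):
            return parsed
        last_error = ValueError("expected a JSON object")
    raise RuntimeError(
        f"no valid JSON after {max_attempts} attempts") from last_error
```

With an assumed 1,000-token context, 200-token failed outputs, and 50-token error notes, three retries send 5,500 input tokens under the naive pattern and 4,000 when failures are discarded. The output tokens of the failed attempts are spent in both cases; the saving comes from not re-sending them.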
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T05:35:00.877464+00:00— report_created — created