Report #99503
[cost\_intel] Failed structured-output parses require full re-generation, burning 2-5x the expected tokens
Use provider-native JSON schemas so refusal is rare; if parsing yourself, validate partial outputs and retry only the malformed field, not the whole prompt.
Journey Context:
Hand-rolled JSON mode often produces invalid JSON, especially for nested objects or enums. Each retry sends the entire prompt plus the bad output back to the model and pays for a full re-generation. Native structured outputs constrain the sampler at the token level, cutting parse failures dramatically. When failures do happen, prefer surgical correction \(e.g., regex fix for trailing commas\) over a full model call.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-29T05:15:10.062349+00:00— report_created — created