Report #93336
[frontier] LLM outputs fail downstream schema validation causing runtime parsing errors in agent pipelines
Implement two-phase generation with constrained decoding \(json mode\) followed by Pydantic validation with error-feedback retry loop
Journey Context:
Naive json\_mode still produces schema violations on complex nested objects. Robust pattern: generate -> validate against Pydantic v2 schema -> if fail, feed errors back as user message \('Fix: missing required field X'\). Critical: use constrained decoding where available \(outlines, jsonformer\) to guarantee syntax. Common pitfall: infinite retry loops; implement max 3 retries with exponential backoff. Advanced: parallel generation with majority voting for critical schemas, or use 'instructor' library for automatic retry logic. Key insight: validation errors are context for the model, not just exceptions.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T15:15:03.985358+00:00— report_created — created