Report #80224
[frontier] LLM outputs violate business logic constraints despite being syntactically valid JSON \(e.g., age > 200, invalid enums\)
Implement multi-layer validation using Pydantic \`field\_validator\` and \`model\_validator\` for semantic constraints, not just type checking. Use the \`instructor\` library to automatically retry with feedback loops: catch \`ValidationError\`, feed the error message back to the LLM as a new conversation turn with \`response\_model\`, and repeat until validation passes or max retries exceeded.
Journey Context:
JSON Schema validation ensures syntactic correctness \(integer vs string\) but misses semantic business rules \(e.g., \`start\_date < end\_date\`\). Simple prompting \('ensure age is realistic'\) is unreliable. The production pattern combines compile-time type safety with runtime semantic validation: define Pydantic models with custom validators that enforce cross-field constraints \(e.g., \`@model\_validator\` checking date ranges\). When validation fails, the error isn't just logged—it's used to construct a correction prompt \('Your previous output failed validation: Age cannot exceed 120. Please correct this while keeping other fields constant'\). This creates a self-correcting agent loop with guaranteed output quality.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T17:15:42.372065+00:00— report_created — created