Report #62316
[architecture] LLM agent outputs malformed JSON or hallucinated fields breaking downstream parser
Validate the output with Pydantic in strict mode inside a retry loop: on ValidationError, feed the error message and the schema back to the LLM for correction. Escalate to a human or a simpler model after N failed attempts.
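A minimal sketch of that retry loop, assuming Pydantic v2. The `Ticket` schema, the `parse_with_retry` helper, and the `call_llm` callable (any function that takes a prompt string and returns the model's raw text) are hypothetical names introduced for illustration, not part of the original report.

```python
import json
from typing import Callable, Optional

from pydantic import BaseModel, ConfigDict, ValidationError


class Ticket(BaseModel):
    # Hypothetical schema. strict=True disables type coercion;
    # extra="forbid" rejects fields that are not declared here.
    model_config = ConfigDict(strict=True, extra="forbid")
    id: int
    title: str


def parse_with_retry(
    call_llm: Callable[[str], str],  # assumed: prompt in, raw text out
    prompt: str,
    max_attempts: int = 3,
) -> Optional[Ticket]:
    """Return a validated Ticket, or None so the caller can escalate."""
    schema = json.dumps(Ticket.model_json_schema())
    message = prompt
    for _ in range(max_attempts):
        raw = call_llm(message)
        try:
            # Raises ValidationError for malformed JSON as well as schema violations.
            return Ticket.model_validate_json(raw)
        except ValidationError as exc:
            # Feed the specific validation errors and the schema back to the model.
            message = (
                f"{prompt}\n\nYour previous reply failed validation:\n{exc}\n\n"
                f"Reply with JSON only, matching this schema:\n{schema}"
            )
    return None  # attempts exhausted: escalate to a human or a simpler model
```

Returning `None` keeps the escalation policy out of the loop, so the caller decides whether to hand off to a human or to another model.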
Journey Context:
Regex extraction from LLM output is fragile. OpenAI function calling improves structure but doesn't guarantee validity (the model can still hallucinate required fields). Pydantic v2 strict mode rejects values that would otherwise be silently coerced (e.g., the string '123' is not accepted for an int field). The retry-with-error-context pattern leverages the LLM's ability to self-correct when given specific validation feedback. Tradeoff: each retry adds latency and token cost, but it prevents cascade failures from malformed data.
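The difference between default (lax) and strict validation can be illustrated with a small sketch, assuming Pydantic v2. The `LaxOrder` and `StrictOrder` models are hypothetical. Note that `extra="forbid"` is a separate setting from strict mode; it is added here as an assumption because strict mode alone does not reject undeclared fields.

```python
from pydantic import BaseModel, ConfigDict, ValidationError


class LaxOrder(BaseModel):
    # Default (lax) mode: numeric strings are coerced to int.
    quantity: int


class StrictOrder(BaseModel):
    # strict=True rejects coercion; extra="forbid" rejects undeclared fields.
    model_config = ConfigDict(strict=True, extra="forbid")
    quantity: int


def validation_errors(payload: dict) -> list:
    """Return the list of error dicts StrictOrder reports for payload."""
    try:
        StrictOrder.model_validate(payload)
        return []
    except ValidationError as exc:
        return exc.errors()


# Lax mode silently turns the string into an int.
lax = LaxOrder.model_validate({"quantity": "123"})

# Strict mode reports the same payload as an error on the "quantity" field.
coercion_errors = validation_errors({"quantity": "123"})

# An undeclared (possibly hallucinated) field is reported on its own key.
extra_errors = validation_errors({"quantity": 1, "color": "red"})
```

Each entry in the error lists carries a `loc` tuple and a `msg` string, which is the kind of specific feedback the retry prompt can pass back to the model.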
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T11:05:03.404510+00:00— report_created — created