Report #5148
[research] LLM hallucinates facts because it is trying to strictly adhere to a requested output format (e.g., JSON schema) and invents data to fill required fields
Make all structured output schema fields optional (nullable) or explicitly instruct the model: 'If a value is unknown, output null or omit the key. Do not guess or invent values to satisfy the schema.'
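A minimal sketch of the workaround above, using only the Python standard library. The schema, the helper name make_nullable, and the constant UNKNOWN_INSTRUCTION are hypothetical and chosen for illustration. The sketch keeps the "required" list and widens each property's type to accept null, on the assumption that some strict structured-output modes want every key listed; if yours allows optional keys, dropping them from "required" is the other half of the same fix.

```python
import copy

# Instruction text taken from the workaround; append it to the system or user prompt.
UNKNOWN_INSTRUCTION = (
    "If a value is unknown, output null or omit the key. "
    "Do not guess or invent values to satisfy the schema."
)


def make_nullable(schema: dict) -> dict:
    """Return a copy of a flat JSON Schema object whose typed properties also accept null.

    Properties without a plain "type" keyword (e.g. ones using anyOf) are left untouched.
    """
    out = copy.deepcopy(schema)
    for prop in out.get("properties", {}).values():
        declared = prop.get("type")
        if isinstance(declared, str) and declared != "null":
            prop["type"] = [declared, "null"]
        elif isinstance(declared, list) and "null" not in declared:
            prop["type"] = declared + ["null"]
    return out


# Hypothetical extraction schema; 'author_name' mirrors the example in this report.
article_schema = {
    "type": "object",
    "properties": {
        "title": {"type": "string"},
        "author_name": {"type": "string"},
        "year": {"type": "integer"},
    },
    "required": ["title", "author_name", "year"],
}

nullable_schema = make_nullable(article_schema)
```

Pass nullable_schema wherever the original schema was supplied, and include UNKNOWN_INSTRUCTION in the prompt so the model is told that null is an acceptable answer.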
Journey Context:
Models have a strong completion drive. When forced into a strict JSON schema where a key expects a string (e.g., 'author_name'), the model will tend to hallucinate a name rather than break the JSON syntax or leave the field blank. This tension between format adherence and factuality is a known failure mode. Allowing nulls resolves the tension in favor of factuality: the model can satisfy the schema without inventing a value, so it no longer has to choose syntax over truth.
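On the consuming side, a null or omitted key then has to be read as "unknown" rather than as an error. The following is a sketch under that assumption; EXPECTED_KEYS, parse_extraction, and the sample payload are hypothetical names and data, not part of any library.

```python
import json
from typing import Any, Dict, List, Tuple

# Hypothetical field list; 'author_name' mirrors the example in this report.
EXPECTED_KEYS = ("title", "author_name", "year")


def parse_extraction(raw: str) -> Tuple[Dict[str, Any], List[str]]:
    """Parse model output, mapping null and omitted keys to None.

    Returns the normalized record and the list of fields the model left unknown.
    """
    data = json.loads(raw)
    record = {key: data.get(key) for key in EXPECTED_KEYS}
    unknown = [key for key, value in record.items() if value is None]
    return record, unknown


# Sample output in which the model declined to guess an author and omitted the year.
record, unknown = parse_extraction('{"title": "Quarterly notes", "author_name": null}')
```

Logging the unknown list makes it visible how often the model declines to answer, which is the behavior the workaround is meant to encourage.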
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-15T20:44:38.115989+00:00— report_created — created