Report #75643
[synthesis] AI code generators produce output in an unconstrained vocabulary, causing inconsistent patterns, hallucinated APIs, and unreliable structure
Constrain the output vocabulary by defining a component library, API surface, or schema that the model must target. Use structured outputs, system prompts with explicit allowed patterns, and post-generation validation against the constrained vocabulary.
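As a minimal sketch of the post-generation validation step, assuming the generated code is Python and the allowed API surface can be written as a plain allow-list: the `ALLOWED_API` table and the `find_violations` helper below are hypothetical names invented for illustration, not part of any tool named in this report. It only catches direct imports and `module.attr` uses, so treat it as a starting point rather than a complete checker.

```python
import ast

# Hypothetical allow-list: module name -> names the model may use from it.
ALLOWED_API = {
    "math": {"sqrt", "floor"},
    "json": {"loads", "dumps"},
}

def find_violations(source: str) -> list[str]:
    """Return every import or module attribute use outside ALLOWED_API."""
    try:
        tree = ast.parse(source)
    except SyntaxError as exc:
        return [f"syntax error: {exc.msg}"]

    violations = []
    aliases = {}  # local name -> real module name, for allowed imports only

    # Pass 1: check imports and record which local names refer to allowed modules.
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                if alias.name not in ALLOWED_API:
                    violations.append(f"import {alias.name}")
                else:
                    aliases[alias.asname or alias.name] = alias.name
        elif isinstance(node, ast.ImportFrom):
            module = node.module or ""
            for alias in node.names:
                if alias.name not in ALLOWED_API.get(module, set()):
                    violations.append(f"from {module} import {alias.name}")

    # Pass 2: check attribute access on allowed modules (e.g. math.fast_sqrt).
    for node in ast.walk(tree):
        if isinstance(node, ast.Attribute) and isinstance(node.value, ast.Name):
            module = aliases.get(node.value.id)
            if module is not None and node.attr not in ALLOWED_API[module]:
                violations.append(f"{module}.{node.attr}")
    return violations

generated = "import math\nimport os\nx = math.sqrt(4)\ny = math.fast_sqrt(4)\n"
print(find_violations(generated))
```

A non-empty result can be used to reject the output or to feed the violations back to the model for a retry.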
Journey Context:
Unconstrained generation gives maximum flexibility but minimum reliability: the model might call nonexistent APIs, mix UI frameworks, or produce inconsistent patterns. Three sources point to the same principle: v0's architecture (generating code exclusively from the shadcn/ui component library), OpenAI's structured outputs feature, and Cursor's pattern of respecting existing project conventions. In each case, constraining the output space dramatically improves reliability. v0 doesn't let the model invent UI components; it must compose from a known set, which reduces the search space from all possible React code to all valid compositions of ~50 components. OpenAI's structured outputs enforce JSON schema compliance at the decoder level. The insight: reliability comes from constraining possibilities, not from better prompting. Define a vocabulary for your domain (component library, API schema, diff format) and make the model generate within it. This trades flexibility for reliability, and in production, reliability wins.
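To make "compose from a known set" concrete, here is a small sketch that assumes the model is asked to emit a JSON component tree instead of free-form UI code. The `VOCABULARY` table, the `component`/`props`/`children` keys, and `validate_tree` are all hypothetical choices for illustration; they do not describe how v0 or OpenAI's structured outputs work internally. This check runs after generation, whereas decoder-level enforcement prevents invalid output from being produced in the first place.

```python
import json

# Hypothetical closed vocabulary: component name -> allowed prop names.
VOCABULARY = {
    "Card": {"title"},
    "Button": {"label", "variant"},
    "Input": {"placeholder"},
}

def validate_tree(node, path="root"):
    """Return a list of error strings for a nested component tree."""
    if not isinstance(node, dict):
        return [f"{path}: expected an object"]

    errors = []
    name = node.get("component")
    if not isinstance(name, str) or name not in VOCABULARY:
        errors.append(f"{path}: unknown component {name!r}")
    else:
        props = node.get("props", {})
        if not isinstance(props, dict):
            errors.append(f"{path}: props must be an object")
        else:
            for prop in props:
                if prop not in VOCABULARY[name]:
                    errors.append(f"{path}: unknown prop {prop!r} on {name}")

    children = node.get("children", [])
    if not isinstance(children, list):
        errors.append(f"{path}: children must be a list")
    else:
        # Recurse so that every nested component is checked against the vocabulary.
        for i, child in enumerate(children):
            errors.extend(validate_tree(child, f"{path}.children[{i}]"))
    return errors

# Example model response containing one invented component and one invented prop.
response = json.loads("""
{"component": "Card", "props": {"title": "Login"}, "children": [
  {"component": "DatePicker"},
  {"component": "Button", "props": {"onHover": 1}}
]}
""")
for error in validate_tree(response):
    print(error)
```

Because the vocabulary is a closed table, adding a component is an explicit decision by the maintainer, not something the model can do on its own.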
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T09:33:38.911441+00:00— report_created — created