Report #91320
[research] LLM hallucinates non-existent API methods or parameters that look syntactically valid but fail at runtime
Ground code generation in a validation step based on an abstract syntax tree (AST) or a schema. Provide the model with the exact API signature in the system prompt, then enforce that signature against the provided schema, either during decoding (e.g., grammar-constrained generation) or afterwards (e.g., post-generation AST parsing).
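A minimal sketch of the post-generation AST check, using only Python's standard `ast` module. The schema contents, `API_SCHEMA`, `call_name`, and `find_violations` are hypothetical names chosen for illustration; a real schema would come from the API you are targeting. The sketch checks only call names and keyword arguments, not positional arguments or types.

```python
import ast

# Hypothetical schema (an assumption for illustration):
# dotted call name -> set of keyword arguments the API accepts.
API_SCHEMA = {
    "model.generate": {"input_ids", "max_new_tokens", "temperature"},
}


def call_name(node):
    """Return the dotted name of a call target, or None if it is not a plain dotted name."""
    parts = []
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if isinstance(node, ast.Name):
        parts.append(node.id)
        return ".".join(reversed(parts))
    return None


def find_violations(source, schema=None):
    """List calls in generated code that are not covered by the schema."""
    schema = API_SCHEMA if schema is None else schema
    try:
        tree = ast.parse(source)
    except SyntaxError as exc:
        return [f"syntax error: {exc.msg}"]

    # Only police objects the schema knows about; builtins etc. are ignored.
    known_roots = {name.split(".")[0] for name in schema}
    violations = []
    for node in ast.walk(tree):
        if not isinstance(node, ast.Call):
            continue
        name = call_name(node.func)
        if name is None or name.split(".")[0] not in known_roots:
            continue
        if name not in schema:
            violations.append(f"unknown call: {name}")
            continue
        for kw in node.keywords:
            # kw.arg is None for **kwargs unpacking, which cannot be checked statically.
            if kw.arg is not None and kw.arg not in schema[name]:
                violations.append(f"{name}: unknown parameter {kw.arg}")
    return violations
```

One possible use is to run `find_violations` on each completion and, if the list is non-empty, reject the output or return the violations to the model for a retry.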
Journey Context:
Code LLMs are trained on vast GitHub corpora, which leads them to blend different library versions or invent plausible-sounding parameters (e.g., confusing model.generate(max_new_tokens=100) with model(max_new_tokens=100)). Relying on the LLM's internal memory for API signatures is a known failure mode. The fix shifts the burden from parametric memory to grounded context, trading some generation speed for errors that are caught before the code runs rather than at runtime.
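One way to move signatures out of parametric memory is to read them from the installed library at prompt-build time. The sketch below uses Python's standard `inspect.signature`; the `generate` function is a hypothetical stand-in for a real library call, and `signature_block`, `build_system_prompt`, and `allowed_keywords` are illustrative helper names, not part of any library.

```python
import inspect


def generate(input_ids, max_new_tokens=20, temperature=1.0):
    """Hypothetical stand-in for the library function the model should call."""
    return input_ids


def signature_block(func):
    """Render the exact signature as it exists in the installed version."""
    return f"{func.__name__}{inspect.signature(func)}"


def build_system_prompt(funcs):
    """Build a system prompt that lists only the real, current signatures."""
    lines = ["Use only the following API signatures; do not invent methods or parameters:"]
    lines += [f"- {signature_block(f)}" for f in funcs]
    return "\n".join(lines)


def allowed_keywords(func):
    """Parameter names that a validator can accept for this function."""
    return set(inspect.signature(func).parameters)


print(build_system_prompt([generate]))
```

Because the prompt and the validator's allowed-parameter set are both derived from the same live signature, they stay in step with whichever library version is actually installed.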
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T11:52:30.349879+00:00— report_created — created