Report #58333
[research] Inventing non-existent library methods, API parameters, or class attributes during code generation
Inject the actual API documentation or type signatures into the prompt context. Constrain the decoding space using grammar-based generation or structured outputs (e.g., JSON schema) if the API schema is known.
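A minimal sketch of the first step, injecting real signatures into the prompt, might look like the following. It assumes the target library is importable in the environment that builds the prompt. The standard-library json module stands in for the SDK, and build_api_context and build_prompt are hypothetical helper names, not part of any library.

```python
import inspect
import json  # stand-in for the SDK the model is asked to use (assumption)


def build_api_context(module, max_doc_chars=80):
    """Return one line per public function: real name, real signature, short doc."""
    lines = []
    for name, func in inspect.getmembers(module, inspect.isfunction):
        if name.startswith("_"):
            continue  # skip private helpers
        try:
            signature = str(inspect.signature(func))
        except (TypeError, ValueError):
            signature = "(...)"  # signature could not be introspected
        doc = (inspect.getdoc(func) or "").split("\n")[0][:max_doc_chars]
        lines.append(f"{module.__name__}.{name}{signature}  # {doc}")
    return "\n".join(lines)


def build_prompt(task, module):
    """Prepend the module's actual API surface to the task description."""
    return (
        "Available functions (complete list):\n"
        f"{build_api_context(module)}\n\n"
        f"Task: {task}"
    )


prompt = build_prompt("Serialize a dict to a string.", json)
print(prompt)
```

Reading signatures from the installed package rather than from a static document keeps the context tied to the library version actually in use. For large SDKs the list would likely need to be filtered to the relevant subset before it fits in the prompt.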
Journey Context:
LLMs hallucinate APIs because they predict the most statistically likely token sequence, not the one that actually exists in a specific library version. A method name like get_user_by_id is highly probable given natural-language and code corpora, even if the actual SDK uses fetch_user. Prompting 'only use valid methods' does not work because the model cannot tell where its knowledge of the library ends. Grounding via RAG with the actual docs is the minimum viable fix; strict schema validation is the robust fix.
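As a complement to grounding, generated code can be checked against the real API surface before it runs. The sketch below is a post-generation check, not constrained decoding: it parses the output with ast and flags attribute names that the imported module does not define. The function name find_unknown_attributes is hypothetical, json again stands in for the SDK, and json.to_string is a deliberately invented method used to simulate a hallucination.

```python
import ast
import json  # stand-in for the real SDK (assumption)


def find_unknown_attributes(source, alias, module):
    """Return attribute names accessed on `alias` that `module` does not define."""
    unknown = set()
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Attribute)
            and isinstance(node.value, ast.Name)
            and node.value.id == alias
            and not hasattr(module, node.attr)
        ):
            unknown.add(node.attr)
    return sorted(unknown)


# Simulated model output: json.dumps exists, json.to_string does not.
generated = """
import json
payload = json.dumps({"id": 1})
user = json.to_string(payload)
"""

missing = find_unknown_attributes(generated, "json", json)
print(missing)
```

This only catches direct attribute access on the module alias. It does not check keyword parameters or methods called on returned objects, so it narrows the problem rather than closing it.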
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T04:24:07.203308+00:00— report_created — created