Report #12355
[research] LLM hallucinates non-existent library functions, classes, or package names that look syntactically correct but raise ImportError or AttributeError at runtime
Constrain generation using grammar-based decoding or structured outputs (e.g., JSON schema with enums for known APIs), and cross-reference generated API calls against a static analysis index or official documentation before execution.
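One way to picture the "JSON schema with enums for known APIs" idea is to derive the enum from the installed module itself. The sketch below is an illustration only: `build_call_schema` and `is_allowed_call` are hypothetical helper names, the object layout (`module`, `function`, `args`) is an assumed shape for the model's structured output, and the check covers only the `const` and `enum` keywords rather than full JSON Schema validation.

```python
import importlib
import json


def build_call_schema(module_name: str) -> dict:
    """Build a JSON-Schema-style dict whose enum lists the module's real public callables."""
    module = importlib.import_module(module_name)
    public_callables = sorted(
        name
        for name in dir(module)
        if not name.startswith("_") and callable(getattr(module, name))
    )
    return {
        "type": "object",
        "properties": {
            "module": {"const": module_name},
            "function": {"enum": public_callables},
            "args": {"type": "array"},
        },
        "required": ["module", "function", "args"],
        "additionalProperties": False,
    }


def is_allowed_call(raw_json: str, schema: dict) -> bool:
    """Minimal check of the const/enum constraints (hypothetical helper).

    A real JSON Schema validator would also enforce the remaining keywords.
    """
    call = json.loads(raw_json)
    if not isinstance(call, dict):
        return False
    props = schema["properties"]
    return (
        call.get("module") == props["module"]["const"]
        and call.get("function") in props["function"]["enum"]
    )
```

The same schema dict could be handed to whatever structured-output or grammar-constrained decoding mechanism is in use, so that a name outside the enum cannot be emitted in the first place. The post-hoc check then acts as a second line of defense.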
Journey Context:
LLMs predict the next token from learned likelihood, not by checking that the code compiles or that a symbol exists, so they will confidently invent calls such as 'numpy.magic_function()'. Prompting 'only use valid APIs' is insufficient. The only reliable fix is external validation: either constrain the vocabulary during generation, or run a static type check or linter in the loop to catch hallucinated symbols. Both approaches trade generation flexibility for runtime safety.
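The "check in the loop" step can be sketched with the standard library alone. The example below is a minimal illustration, not a replacement for a real linter or type checker: `find_unresolved_symbols` is a hypothetical helper name, it only follows names bound by `import` statements, it ignores local variables that shadow an imported name, and it imports the referenced modules in order to inspect them (which executes their top-level code, so it should only be run against trusted, installed packages).

```python
import ast
import importlib
import importlib.util
from types import ModuleType
from typing import Optional


def _module_exists(name: str) -> bool:
    """True if `name` can be located as an importable module."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        return False


def _dotted(node: ast.AST) -> Optional[list]:
    """Turn an Attribute chain such as np.linalg.norm into ['np', 'linalg', 'norm']."""
    parts = []
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if isinstance(node, ast.Name):
        parts.append(node.id)
        return parts[::-1]
    return None  # chain rooted in a call, subscript, etc.; not checked here


def find_unresolved_symbols(source: str) -> list:
    """Return dotted names in `source` that do not resolve to a real module or attribute."""
    tree = ast.parse(source)
    aliases = {}   # local name -> real module path
    problems = []

    def report(name: str) -> None:
        if name not in problems:
            problems.append(name)

    # Pass 1: verify every import and remember which local names are modules.
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                if not _module_exists(alias.name):
                    report(alias.name)
                elif alias.asname:
                    aliases[alias.asname] = alias.name
                else:
                    top = alias.name.split(".")[0]
                    aliases[top] = top
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            if not _module_exists(node.module):
                report(node.module)
                continue
            module = importlib.import_module(node.module)
            for alias in node.names:
                full = f"{node.module}.{alias.name}"
                if (alias.name != "*" and not hasattr(module, alias.name)
                        and not _module_exists(full)):
                    report(full)

    # Pass 2: resolve attribute chains rooted at an imported module.
    for node in ast.walk(tree):
        chain = _dotted(node) if isinstance(node, ast.Attribute) else None
        if not chain or chain[0] not in aliases:
            continue
        path = aliases[chain[0]]
        obj = importlib.import_module(path)
        for attr in chain[1:]:
            path = f"{path}.{attr}"
            if hasattr(obj, attr):
                obj = getattr(obj, attr)
            elif isinstance(obj, ModuleType) and _module_exists(path):
                obj = importlib.import_module(path)
            else:
                report(path)  # first prefix that fails to resolve
                break
    return problems
```

A generation loop could call this on each candidate snippet and feed any reported names back to the model as an error message, or reject the snippet outright, before anything is executed.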
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-16T15:46:56.891714+00:00— report_created — created