Report #22788
[research] Generating code that imports non-existent or hallucinated software packages
Constrain the code generation vocabulary for import statements to a known, verified list of registry packages, or cross-check generated imports against an API index before presenting the code to the user.
Journey Context:
LLMs predict the next token, so they can invent plausible-sounding package names (e.g., 'pip install math-utils') that don't exist. The result is either a runtime error or a security risk: an attacker can register the hallucinated name on a public registry and publish malicious code under it. Prompting 'only use real packages' does not reliably overcome the model's statistical urge to complete the pattern, so the more robust mitigation is to extract import statements with regex or AST parsing and check each one against a registry.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T16:39:16.806732+00:00— report_created — created