Report #4359
[research] LLM generates calls to non-existent standard library methods or hallucinated third-party packages
Integrate static analysis (e.g., a language server speaking the Language Server Protocol, or type checkers like mypy/pyright) into the agent loop. If the generated code fails compilation or type-checking, feed the exact compiler error back to the agent for an immediate self-correction step before the code is shown to the user.
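A minimal sketch of that loop, under stated assumptions: `generate_with_feedback`, `syntax_check`, and `fake_generate` are hypothetical names introduced here for illustration, and the built-in `compile()` stands in for a real checker. A syntax check alone will not catch a hallucinated method; in practice the `check` callable would wrap mypy, pyright, or a language server and return that tool's diagnostics instead.

```python
from typing import Callable, Optional

# A checker takes source code and returns None on success,
# or the exact error text to hand back to the model.
Checker = Callable[[str], Optional[str]]
# A generator takes the previous error (or None) and returns new code.
Generator = Callable[[Optional[str]], str]


def syntax_check(source: str) -> Optional[str]:
    """Stand-in checker: uses the built-in compile() to detect syntax errors."""
    try:
        compile(source, "<generated>", "exec")
    except SyntaxError as exc:
        return f"SyntaxError: {exc.msg} (line {exc.lineno})"
    return None


def generate_with_feedback(
    generate: Generator,
    check: Checker = syntax_check,
    max_attempts: int = 3,
) -> str:
    """Regenerate until the checker passes or the attempt budget runs out."""
    feedback: Optional[str] = None
    for _ in range(max_attempts):
        code = generate(feedback)   # model sees the previous error, if any
        feedback = check(code)      # deterministic verdict on this attempt
        if feedback is None:
            return code             # only checked code reaches the user
    raise RuntimeError(f"still failing after {max_attempts} attempts: {feedback}")


def fake_generate(feedback: Optional[str]) -> str:
    """Hypothetical model stub: broken on the first try, fixed once it sees an error."""
    return "x = (1" if feedback is None else "x = 1"
```

Calling `generate_with_feedback(fake_generate)` runs two attempts: the first output fails the check, the error text is passed back, and the second output is returned. Capping the attempts keeps a model that cannot fix the error from looping indefinitely.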
Journey Context:
Code LLMs predict the next token from learned syntax patterns; they do not consult an AST or a symbol table. They frequently invent plausible-sounding methods (e.g., str.remove_punctuation()) or entire packages that don't exist. Relying on the user to catch these at runtime is a poor experience. By adding a lightweight compiler/type-checker as a tool in the agent's environment, the agent gets deterministic feedback on whether the symbols it references actually exist, bridging the gap between probabilistic generation and deterministic execution.
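To illustrate what "deterministic feedback on existence" can look like, here is a rough sketch that uses only the standard library rather than a full type checker. `missing_imports` is a hypothetical helper, not part of any tool named above; it only checks that top-level imported modules can be found in the current environment, and `hasattr` covers the invented-method example for a built-in type.

```python
import ast
import importlib.util
from typing import List


def missing_imports(source: str) -> List[str]:
    """Return top-level module names imported by `source` that cannot be found."""
    missing: List[str] = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            names = [node.module]   # absolute "from pkg import x" only
        else:
            continue
        for name in names:
            top = name.split(".")[0]   # check the top-level package only
            try:
                found = importlib.util.find_spec(top) is not None
            except (ImportError, ValueError):
                found = False
            if not found and top not in missing:
                missing.append(top)
    return missing


# The invented method from the example above does not exist on str.
method_exists = hasattr(str, "remove_punctuation")
```

The result of `missing_imports` can be formatted into an error message and fed back to the agent in the same way as a compiler diagnostic. A real type checker covers far more (attribute access on inferred types, wrong signatures), so this is a narrow approximation rather than a replacement.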
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-15T19:17:05.464701+00:00— report_created — created