Report #12837
[research] Model generates code using non-existent libraries, classes, or methods (e.g., pandas.fast_read())
Constrain generation with a static analysis tool or a grammar-based decoder. For RAG-based coding agents, require every generated API call to match, by exact name, a symbol in the schema parsed from the retrieved documentation. If a method is not in that schema, either block its generation or append a comment marking it as an unverified placeholder.
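A minimal sketch of the post-generation variant of this check, assuming the retrieved documentation has already been reduced to a set of fully qualified names. ALLOWED_CALLS, dotted_name, and find_unknown_calls are hypothetical names introduced here for illustration, and the schema contents are made up. It only inspects calls made directly on imported modules, so method calls on returned objects (for example df.head()) are not checked.

```python
import ast

# Hypothetical schema: a real agent would build this from the retrieved docs.
ALLOWED_CALLS = {"pandas.read_csv", "pandas.read_json", "pandas.DataFrame"}


def dotted_name(node):
    """Return 'a.b.c' for a Name/Attribute chain, or None for anything else."""
    parts = []
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if isinstance(node, ast.Name):
        parts.append(node.id)
        return ".".join(reversed(parts))
    return None


def find_unknown_calls(source, allowed=ALLOWED_CALLS):
    """List (line, qualified_name) for module-level API calls not in the schema."""
    tree = ast.parse(source)

    # Map local names to the modules they refer to, e.g. 'pd' -> 'pandas'.
    aliases = {}
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                if alias.asname:
                    aliases[alias.asname] = alias.name
                else:
                    top = alias.name.split(".")[0]
                    aliases[top] = top

    unknown = []
    for node in ast.walk(tree):
        if not isinstance(node, ast.Call):
            continue
        name = dotted_name(node.func)
        if name is None:
            continue
        root, _, rest = name.partition(".")
        # Only check calls whose root is an imported module alias.
        if root in aliases and rest:
            qualified = f"{aliases[root]}.{rest}"
            if qualified not in allowed:
                unknown.append((node.lineno, qualified))
    return sorted(unknown)


generated = (
    "import pandas as pd\n"
    "df = pd.fast_read('x.csv')\n"
    "ok = pd.read_csv('x.csv')\n"
)
print(find_unknown_calls(generated))  # [(2, 'pandas.fast_read')]
```

A caller could reject the completion when the returned list is non-empty, or rewrite the flagged lines with a placeholder comment. Exact-name matching is deliberately strict: it will also flag real APIs that the retrieval step failed to return.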
Journey Context:
Code LLMs predict the next token from learned token probabilities, not from program semantics. They invent plausible-looking APIs that fit the surrounding context but raise AttributeError at runtime. RAG helps, but unless the model is strictly constrained by a schema or a static verifier, it will still prefer a fluent hallucinated method over a slightly awkward real one.
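The AttributeError failure can be caught before the generated code runs by resolving each dotted name against the installed package. The sketch below is one possible check, not a confirmed fix: resolves is a hypothetical helper, and the standard-library json module stands in for pandas so the example stays self-contained. json.fast_loads is a deliberately invented name, analogous to pandas.fast_read. Note that this imports the target module, which executes its top-level code, so it should only be used on trusted packages.

```python
import importlib


def resolves(dotted_path):
    """Return True if 'module.attr.attr' resolves against the installed package."""
    module_name, _, attr_path = dotted_path.partition(".")
    try:
        obj = importlib.import_module(module_name)
    except ImportError:
        # The module itself does not exist (or cannot be imported).
        return False
    # Walk the remaining attributes one at a time.
    for attr in attr_path.split(".") if attr_path else []:
        if not hasattr(obj, attr):
            return False
        obj = getattr(obj, attr)
    return True


print(resolves("json.loads"))       # True
print(resolves("json.fast_loads"))  # False
```

Unlike a schema match, this check depends on the package version installed in the environment where it runs, so a name can resolve here and still be missing in the deployment target.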
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-16T17:10:01.977361+00:00— report_created — created