Report #39251
[research] LLM generates code using non-existent library methods, classes, or parameters that sound plausible but do not exist in the actual documentation
Provide the actual library documentation or type definitions in the context (e.g., via RAG or repo indexing). Instruct the model to adhere strictly to the provided signatures. If a method is not in the context, the model must implement it from scratch or use a known alternative.
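A minimal sketch of this kind of grounding, assuming the target library can be imported where the prompt is built. The helper names `collect_signatures` and `build_grounded_prompt` are hypothetical, and the standard-library `json` module only stands in for the real library; a RAG or repo-indexing setup would supply the signatures by other means.

```python
import inspect
import json  # stand-in for the library the model should be grounded on


def collect_signatures(module):
    """Return 'module.name(signature)' lines for a module's public callables."""
    lines = []
    for name in sorted(dir(module)):
        if name.startswith("_"):
            continue
        obj = getattr(module, name)
        if not callable(obj):
            continue
        try:
            sig = str(inspect.signature(obj))
        except (TypeError, ValueError):
            # Some callables expose no introspectable signature.
            sig = "(...)"
        lines.append(f"{module.__name__}.{name}{sig}")
    return lines


def build_grounded_prompt(task, module):
    """Build a prompt that lists the real signatures ahead of the task."""
    signatures = "\n".join(collect_signatures(module))
    return (
        f"Use ONLY the following {module.__name__} signatures.\n"
        "If a function you need is not listed, implement it from scratch "
        "or use a known alternative.\n\n"
        f"{signatures}\n\n"
        f"Task: {task}\n"
    )


if __name__ == "__main__":
    print(build_grounded_prompt("Serialize a dict to a string.", json))
```

The prompt wording is illustrative only; the point is that the signatures come from the installed library rather than from the model's memory.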
Journey Context:
Code LLMs are trained on vast GitHub corpora, which produces a 'popularity bias': they mix up APIs from similar libraries (e.g., PyTorch and TensorFlow methods) or invent parameters that fit the schema but do not exist. Benchmarks like HumanEval rarely catch this because their tasks rely mostly on the standard library, whereas proprietary or niche libraries trigger it heavily in real-world use. Contextual grounding is the only reliable fix.
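Both failure modes described above (a method borrowed from a similar library, and a parameter that fits the schema but does not exist) can be illustrated with a small static check. This is a rough sketch, not part of the reported fix: the helper `find_unknown_api_uses` is hypothetical, it only handles direct `module.function(...)` calls on an unaliased import, and it skips keyword checks when the target accepts `**kwargs`. `shutil` is used purely as an example library.

```python
import ast
import importlib
import inspect


def find_unknown_api_uses(source, module_name):
    """List calls in `source` that use attributes or keyword arguments
    missing from the installed module. The source is parsed, never run."""
    module = importlib.import_module(module_name)
    problems = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        is_direct_call = (
            isinstance(func, ast.Attribute)
            and isinstance(func.value, ast.Name)
            and func.value.id == module_name
        )
        if not is_direct_call:
            continue
        if not hasattr(module, func.attr):
            problems.append(f"{module_name}.{func.attr} does not exist")
            continue
        try:
            params = inspect.signature(getattr(module, func.attr)).parameters
        except (TypeError, ValueError):
            continue  # no introspectable signature; cannot check keywords
        if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
            continue  # **kwargs accepts any keyword name
        for kw in node.keywords:
            if kw.arg is not None and kw.arg not in params:
                problems.append(
                    f"{module_name}.{func.attr} has no parameter '{kw.arg}'"
                )
    return problems


if __name__ == "__main__":
    # copy_tree belongs to distutils, not shutil; overwrite is invented.
    generated = (
        "import shutil\n"
        'shutil.copy("a", "b", overwrite=True)\n'
        'shutil.copy_tree("a", "b")\n'
    )
    for problem in find_unknown_api_uses(generated, "shutil"):
        print(problem)
```

A check like this only detects the symptom after generation; it does not replace supplying the real signatures in the context.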
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T20:21:25.477548+00:00— report_created — created