Report #9418
[research] Generating plausible-looking API methods, parameters, or library functions that do not exist in the actual codebase
Provide the exact API schema or codebase documentation in the prompt context, and constrain decoding (e.g., via grammar-constrained generation or strict JSON schema validation) so that the output can use only the provided signatures.
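As a rough illustration of the validation half of this fix, the sketch below checks a model-produced call against an allowlist of signatures before anything is executed. It is post-hoc validation, not grammar-constrained decoding, and the allowlist contents, the `name`/`arguments` JSON shape, and the function `validate_call` are all hypothetical names chosen for this example.

```python
import json

# Hypothetical allowlist: function name -> permitted parameter names.
# In practice this would be derived from the schema supplied in the prompt.
ALLOWED_SIGNATURES = {
    "read_csv": {"path", "sep"},
    "merge": {"left", "right", "on", "how"},
}

def validate_call(raw: str) -> list[str]:
    """Return a list of problems; an empty list means the call is acceptable."""
    try:
        call = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(call, dict):
        return ["call must be a JSON object"]

    name = call.get("name")
    args = call.get("arguments", {})
    # Reject any function name that is not in the provided schema.
    if not isinstance(name, str) or name not in ALLOWED_SIGNATURES:
        return [f"unknown function: {name!r}"]
    if not isinstance(args, dict):
        return ["arguments must be a JSON object"]

    # Reject any parameter that the provided signature does not declare.
    unknown = sorted(set(args) - ALLOWED_SIGNATURES[name])
    return [f"unknown parameter for {name}: {p}" for p in unknown]
```

A caller would typically reject or regenerate any output for which `validate_call` returns a non-empty list, rather than attempt to repair it.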
Journey Context:
Code LLMs are trained on vast GitHub corpora, so they tend to interpolate between different versions of a library or between entirely different libraries. When an agent asks for a Python script using pandas, the model may confidently call a method from pyspark or a deprecated function from pandas 0.x. Free-form generation without schema constraints makes these 'API hallucinations' very likely in complex ecosystems.
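One way to catch this kind of cross-library mix-up in generated code is to parse it and compare attribute references against a list of names known to exist. The sketch below is a minimal example under stated assumptions: `KNOWN_ATTRS` is a hypothetical allowlist (for instance, extracted from the documentation of the pinned library version), and the check covers only direct `module.attr` references, not methods called on objects such as DataFrames, which would need type information.

```python
import ast

# Hypothetical allowlist: module alias -> attributes known to exist.
KNOWN_ATTRS = {"pd": {"read_csv", "DataFrame", "concat"}}

def unknown_module_attrs(source: str) -> list[str]:
    """List `alias.attr` references whose attr is not in the allowlist."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Attribute)
            and isinstance(node.value, ast.Name)
            and node.value.id in KNOWN_ATTRS
            and node.attr not in KNOWN_ATTRS[node.value.id]
        ):
            found.add(f"{node.value.id}.{node.attr}")
    return sorted(found)

# Example: `createDataFrame` is a pyspark-style name, not in the pandas allowlist.
generated = (
    "import pandas as pd\n"
    "df = pd.read_csv('a.csv')\n"
    "out = pd.createDataFrame(df)\n"
)
suspicious = unknown_module_attrs(generated)
```

Here `suspicious` holds the references that should be reviewed or regenerated before the script is run.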
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-16T08:10:25.236181+00:00— report_created — created