Report #35772
[research] Hallucinated Package/API Methods in Code Generation
Constrain code generation to verified API schemas, using structured output or few-shot prompting with snippets from the actual documentation. Explicitly instruct the model to use only standard, well-known libraries and to reject requests for obscure packages.
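One way to supply documentation snippets is to build them from the installed library itself rather than from memory. The sketch below is illustrative only: describe_api and build_prompt are hypothetical helper names, the prompt wording is an assumption, and the standard-library json module stands in for whatever library the generated code targets.

```python
import inspect
import json


def describe_api(module, names):
    """Return one 'module.name(signature)  # summary' line per callable that really exists."""
    lines = []
    for name in names:
        obj = getattr(module, name, None)
        if obj is None or not callable(obj):
            continue  # skip anything the installed module does not provide
        try:
            sig = str(inspect.signature(obj))
        except (TypeError, ValueError):
            sig = "(...)"  # some builtins expose no introspectable signature
        doc = inspect.getdoc(obj) or ""
        summary = doc.splitlines()[0] if doc else ""
        lines.append(f"{module.__name__}.{name}{sig}  # {summary}")
    return lines


def build_prompt(task, module, names):
    """Assemble a prompt whose API listing comes from the live module (hypothetical wording)."""
    api_lines = describe_api(module, names)
    return (
        "Use ONLY the functions listed below. "
        "If none fits, say so instead of inventing one.\n\n"
        + "\n".join(api_lines)
        + f"\n\nTask: {task}\n"
    )


# "load_parallel" does not exist in json, so it is silently left out of the listing.
prompt = build_prompt(
    "Parse a JSON string into a dict.",
    json,
    ["loads", "dumps", "load_parallel"],
)
print(prompt)
```

Because the listing is derived by introspection, a name the library does not define never reaches the prompt, which narrows what the model is invited to call. It does not guarantee the model stays within the list.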
Journey Context:
LLMs predict the next token based on linguistic probability, not semantic validity. A real method name like pandas.read_csv() is highly probable, but a nonexistent one like pandas.read_json_parallel() can also be probable when the prompt implies parallelism. Relying solely on the model's parametric memory for API surfaces is a known failure mode. Grounding the prompt in actual documentation or constraining the output space is required to prevent plausible-looking calls that do not compile or run.
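A complementary check is to validate generated code against the installed modules before running it. The following is a minimal sketch under stated assumptions, not a complete validator: find_missing_attributes is a hypothetical helper, it only inspects direct module.attribute references for modules brought in with a plain import statement, and it uses standard-library modules so that it runs without extra dependencies.

```python
import ast
import importlib


def find_missing_attributes(source):
    """Return sorted dotted names (e.g. 'json.load_parallel') that do not resolve."""
    tree = ast.parse(source)

    # Map each local name to the real module for `import x` and `import x as y`.
    aliases = {}
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                if alias.asname:
                    aliases[alias.asname] = alias.name
                else:
                    top = alias.name.split(".")[0]  # `import a.b` binds the name `a`
                    aliases[top] = top

    missing = set()
    for node in ast.walk(tree):
        if (
            isinstance(node, ast.Attribute)
            and isinstance(node.value, ast.Name)
            and node.value.id in aliases
        ):
            module_name = aliases[node.value.id]
            try:
                # Note: this imports the module for real; do it in a sandbox
                # if the generated code names packages you do not trust.
                module = importlib.import_module(module_name)
            except ImportError:
                missing.add(module_name)
                continue
            if not hasattr(module, node.attr):
                missing.add(f"{module_name}.{node.attr}")
    return sorted(missing)


generated = (
    "import json\n"
    "import math as m\n"
    "x = json.loads('{}')\n"
    "y = json.load_parallel('{}')\n"
    "z = m.sqrt(4)\n"
    "w = m.fast_sqrt(4)\n"
)
print(find_missing_attributes(generated))
```

The check parses the code without executing it, so a hallucinated call is flagged before it can fail at run time. It does not cover from-imports, methods on objects, or wrong argument lists, so it narrows the failure mode rather than eliminating it.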
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T14:31:10.718188+00:00— report_created — created