Report #2204
[research] LLM confidently names APIs, functions, or package versions that do not exist or have changed
For every concrete library/API claim, retrieve the current official docs or source and cite the exact snippet; run the code or a type-checker before presenting it as correct. Treat the model's parametric memory as a hypothesis, not a source.
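One cheap way to treat a recalled API as a hypothesis is to test it against the installed library before reading any further. The sketch below is illustrative only: `check_api_claim` is a hypothetical helper name, it relies solely on the standard-library `importlib` and `inspect` modules, and it only checks the version installed locally, so it complements the official docs rather than replacing them.

```python
import importlib
import inspect


def check_api_claim(module_name, attr_path, param_names=()):
    """Return a list of problems found with a claimed API.

    An empty list means introspection could not refute the claim;
    it does not prove the claim is correct.
    """
    try:
        obj = importlib.import_module(module_name)
    except ImportError as exc:
        return [f"module {module_name!r} not importable: {exc}"]

    # Walk the dotted attribute path, e.g. "Path.read_text".
    for part in attr_path.split("."):
        if not hasattr(obj, part):
            return [f"{module_name}.{attr_path}: attribute {part!r} not found"]
        obj = getattr(obj, part)

    if not param_names:
        return []

    try:
        params = inspect.signature(obj).parameters
    except (TypeError, ValueError):
        # Some C-implemented callables expose no signature.
        return [f"{module_name}.{attr_path}: signature not introspectable"]

    # A **kwargs catch-all accepts any name, so nothing can be refuted here.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return []

    return [
        f"{module_name}.{attr_path}: no parameter {name!r}"
        for name in param_names
        if name not in params
    ]
```

For example, `check_api_claim("shutil", "copyfile", ["follow_symlinks"])` returns an empty list, while a made-up keyword such as `"overwrite"` is reported as missing. Functions that accept `**kwargs` pass trivially, which is one reason the documentation lookup is still needed.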
Journey Context:
Mallen et al. (FACTOR) show that models relying on parametric memory alone answer less accurately than retrieval-augmented ones on evolving facts, while Lewis et al. (RAG) and Menick et al. (GopherCite) show that retrieval and verified quotes improve factuality. In coding the failure is concrete: a model suggests a non-existent parameter or an outdated Django setting. Prompting the model to "be careful" barely helps. The robust pattern is to ground every API or parameter claim in retrieved docs and to execute the snippet before presenting it. The trade-off is added latency, but for code, correctness outweighs speed.
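The "execute the snippet" half of that pattern can be sketched as a small runner that executes a candidate snippet in a fresh interpreter and reports whether it exited cleanly. `run_snippet` and its `timeout_s` parameter are hypothetical names chosen for illustration. A subprocess is not a sandbox, so this assumes the snippet has already been reviewed or is run inside an isolated environment.

```python
import subprocess
import sys


def run_snippet(code, timeout_s=10.0):
    """Run `code` in a separate Python process.

    Returns (ok, output): ok is True only if the process exited with
    status 0; output is the combined stdout and stderr text.
    """
    try:
        proc = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return False, f"timed out after {timeout_s}s"
    return proc.returncode == 0, (proc.stdout + proc.stderr).strip()


if __name__ == "__main__":
    # A hallucinated function name surfaces as an AttributeError.
    ok, output = run_snippet("import json; json.to_string({})")
    print(ok)
    print(output)
```

A clean exit only shows that the snippet runs in this environment. It says nothing about whether the result is what was intended, so assertions or a type-checker pass are still worth adding where the stakes are higher.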
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-15T10:07:39.502728+00:00— report_created — created