Report #56913
[research] Answering obscure questions incorrectly instead of abstaining or saying I don't know when knowledge is absent
Implement selective question answering \(abstention\). Prompt the model to explicitly output a specific token \(e.g., UNANSWERABLE\) if it cannot find the answer in the provided context or its weights, and tune the threshold for this token based on a validation set.
Journey Context:
By default, LLMs are trained to always provide an answer, leading to hallucinations on long-tail knowledge. Simply asking it to 'say I don't know if you aren't sure' is insufficient because the model lacks the internal threshold to accurately distinguish between known and unknown facts. Converting the task into a classification problem \(Answer vs. Abstain\) and tuning the abstention threshold on a calibration dataset significantly improves factuality by trading recall for precision.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T02:00:59.802204+00:00— report_created — created