Report #9708
[research] Model answers every question, leading to high hallucination rates on out-of-distribution or obscure queries
Implement a selective prediction protocol: have the model generate an answer together with a self-assessed probability that the answer is correct. If the probability falls below a calibrated threshold, output 'I don't know' or trigger a fallback (e.g., web search). Calibrate the threshold on a held-out validation set.
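The answer-or-abstain step could be sketched roughly as follows. This is a minimal illustration, not a reference implementation: `Prediction`, `selective_predict`, and the `fallback` callable are hypothetical names introduced here, and how the model produces its self-assessed probability is left out.

```python
from dataclasses import dataclass
from typing import Callable, Optional

ABSTAIN_MESSAGE = "I don't know"


@dataclass
class Prediction:
    """Hypothetical container for one model output."""
    answer: str
    confidence: float  # self-assessed probability of being correct, in [0, 1]


def selective_predict(
    pred: Prediction,
    threshold: float,
    fallback: Optional[Callable[[], str]] = None,
) -> str:
    """Return the answer only if its confidence clears the threshold.

    Below the threshold, call the fallback (e.g. a web-search wrapper)
    if one is given, otherwise abstain.
    """
    if not 0.0 <= pred.confidence <= 1.0:
        raise ValueError("confidence must be in [0, 1]")
    if pred.confidence >= threshold:
        return pred.answer
    if fallback is not None:
        return fallback()
    return ABSTAIN_MESSAGE
```

For example, `selective_predict(Prediction("Paris", 0.95), threshold=0.8)` returns the answer, while a prediction with confidence 0.4 at the same threshold yields the abstain message or whatever the fallback returns.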
Journey Context:
Standard LLMs are trained to always complete the sequence and have no 'abstain' token. This forces them to guess even when they lack the relevant knowledge, resulting in hallucinations. Selective prediction lets the system trade recall for precision: it answers fewer questions but is right more often on the ones it does answer. The key is that the threshold must be empirically calibrated for the specific model and domain; a generic 0.9 threshold behaves very differently across models and tasks.
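One way to calibrate the threshold empirically is sketched below, under stated assumptions: the validation set provides, for each question, the model's self-assessed probability and whether its answer was correct, and the operator picks a target precision for answered questions. The function name `calibrate_threshold` and the `target_precision` parameter are hypothetical, and other selection criteria (for example, a minimum coverage) would be equally reasonable.

```python
from typing import Optional, Sequence


def calibrate_threshold(
    confidences: Sequence[float],
    correct: Sequence[bool],
    target_precision: float,
) -> Optional[float]:
    """Pick the lowest threshold that meets target_precision on validation data.

    An example is "answered" when confidence >= threshold. The lowest
    qualifying threshold answers the most questions. Returns None if no
    threshold reaches the target.
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")

    # Walk from most to least confident, tracking precision of the answered set.
    ranked = sorted(zip(confidences, correct), key=lambda p: p[0], reverse=True)
    best: Optional[float] = None
    n_correct = 0
    for i, (conf, ok) in enumerate(ranked, start=1):
        n_correct += int(ok)
        # Tied confidences are accepted or rejected together, so only
        # evaluate once the whole tie group has been counted.
        if i < len(ranked) and ranked[i][0] == conf:
            continue
        if n_correct / i >= target_precision:
            best = conf
    return best
```

The returned value only reflects the model and domain the validation set was drawn from, so it would need to be recomputed whenever either changes.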
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-16T08:50:20.827235+00:00— report_created — created