Report #37827
[research] LLM answers a question it lacks knowledge for instead of abstaining
Implement a selective answering mechanism: prompt the model to output a specific 'UNANSWERABLE' token when the retrieved context is insufficient or its internal knowledge is missing, and enforce a strict routing rule that halts generation whenever this token appears.
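A minimal sketch of the prompt-plus-routing idea, under stated assumptions: the sentinel is treated as a plain string rather than a dedicated vocabulary token, and `build_prompt`, `route`, `answer`, and the `generate` callable are hypothetical names introduced for illustration, not part of any specific library.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Assumption: the sentinel is matched as a plain string in the model output.
SENTINEL = "UNANSWERABLE"


def build_prompt(question: str, context: str) -> str:
    """Build a prompt that gives the model an explicit way to abstain."""
    return (
        "Answer using only the context below. If the context is insufficient "
        f"or you do not know the answer, reply with exactly {SENTINEL} "
        "and nothing else.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )


@dataclass
class RoutedAnswer:
    answered: bool
    text: Optional[str]


def route(raw_output: str) -> RoutedAnswer:
    """Strict rule: any occurrence of the sentinel discards the whole output."""
    cleaned = raw_output.strip()
    if SENTINEL in cleaned.upper():
        return RoutedAnswer(answered=False, text=None)
    return RoutedAnswer(answered=True, text=cleaned)


def answer(question: str, context: str,
           generate: Callable[[str], str]) -> RoutedAnswer:
    """`generate` is a hypothetical stand-in for whatever model call is in use."""
    return route(generate(build_prompt(question, context)))
```

The check is deliberately strict: an output that mixes the sentinel with other text is still treated as an abstention, so a partially hallucinated answer never reaches the user. With a streaming backend, the same check could be applied to the accumulated text to stop generation early.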
Journey Context:
By default, LLMs are trained to always provide a response, which leads to hallucination when the relevant knowledge is absent. Simply asking the model to 'tell me if you don't know' is insufficient. The model needs an explicit, low-cost escape hatch (a special token) and a system-level guardrail that catches it. Selective prediction (answering only when confidence exceeds a threshold) can reduce hallucination rates at the cost of coverage, because some answerable questions are declined as well.
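The confidence-threshold variant of selective prediction can be sketched as follows. This is an illustration only: it assumes per-token log-probabilities are available from the model backend and uses their geometric-mean probability as the confidence score, which is one common proxy and not something the report specifies. The function names and the 0.8 threshold are hypothetical.

```python
import math
from typing import Optional, Sequence


def sequence_confidence(token_logprobs: Sequence[float]) -> float:
    """Geometric-mean token probability; an assumed proxy for confidence."""
    if not token_logprobs:
        return 0.0
    return math.exp(sum(token_logprobs) / len(token_logprobs))


def selective_answer(text: str, token_logprobs: Sequence[float],
                     threshold: float = 0.8) -> Optional[str]:
    """Return the answer only when confidence > threshold, else abstain (None)."""
    if sequence_confidence(token_logprobs) > threshold:
        return text
    return None


def coverage(results: Sequence[Optional[str]]) -> float:
    """Fraction of questions that received an answer instead of an abstention."""
    if not results:
        return 0.0
    return sum(r is not None for r in results) / len(results)
```

Raising `threshold` lowers coverage, which is the trade-off described above. A suitable value would need to be tuned on held-out questions with known answers.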
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T17:58:03.380199+00:00— report_created — created