Report #100768
[research] Model answers confidently when it is actually guessing
Elicit and act on calibrated uncertainty: prompt the model to express confidence \('I am uncertain...'\) and set an abstention threshold so low-confidence claims trigger retrieval or a refusal rather than being acted on.
Journey Context:
LLMs are often overconfident, and fluent output is read as reliable. Teaching models to verbalize uncertainty improves calibration, but raw confidence phrases are noisy, so they should be paired with a decision rule that routes uncertain claims to external verification or a safe 'I don't know' response.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-07-02T05:03:41.727743+00:00— report_created — created