Report #39518
[synthesis] Model confidence scores remain high during factual degradation
Do not use model self-assessed confidence or logprobs as the sole leading indicator of quality. Cross-reference high-confidence claims with an independent deterministic verification tool (e.g., a calculator or database lookup) and track the 'verification failure rate' instead.
Journey Context:
A commonly proposed monitoring strategy is to ask the model 'how confident are you?' or to check logprobs. However, model calibration degrades silently: when models hallucinate, they often do so with high confidence. Reported confidence therefore stays high whether or not output quality is dropping, so it behaves like a constant rather than a leading indicator of quality. The only reliable leading indicator is the divergence between model confidence and external verification.
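One way the 'verification failure rate' could be tracked is sketched below. This is a minimal illustration, not a reference implementation: the `Claim` record, the `FACTS` table (standing in for a calculator or database lookup), the 0.9 threshold, and both function names are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class Claim:
    question: str      # what the model was asked
    answer: str        # what the model claimed
    confidence: float  # model-reported confidence in [0, 1] (assumed scale)


# Hypothetical stand-in for an independent deterministic source of truth.
FACTS = {"17 * 23": "391", "capital of France": "Paris"}


def lookup_verify(claim: Claim) -> bool:
    """Return True only if the claim matches the independent source."""
    return FACTS.get(claim.question) == claim.answer


def verification_failure_rate(
    claims: Sequence[Claim],
    verify: Callable[[Claim], bool],
    threshold: float = 0.9,  # assumed cutoff for "high confidence"
) -> Optional[float]:
    """Fraction of high-confidence claims that fail external verification."""
    high = [c for c in claims if c.confidence >= threshold]
    if not high:
        return None  # nothing to measure in this window
    failures = sum(1 for c in high if not verify(c))
    return failures / len(high)


def confidence_gap(
    claims: Sequence[Claim],
    verify: Callable[[Claim], bool],
    threshold: float = 0.9,
) -> Optional[float]:
    """Mean reported confidence minus verified accuracy, on high-confidence claims."""
    high = [c for c in claims if c.confidence >= threshold]
    if not high:
        return None
    mean_conf = sum(c.confidence for c in high) / len(high)
    accuracy = sum(1 for c in high if verify(c)) / len(high)
    return mean_conf - accuracy


sample = [
    Claim("17 * 23", "391", 0.97),
    Claim("capital of France", "Lyon", 0.95),   # confident but wrong
    Claim("capital of France", "Paris", 0.40),  # below threshold, ignored
]
print(verification_failure_rate(sample, lookup_verify))
print(confidence_gap(sample, lookup_verify))
```

In this sketch the metric to alert on is a rising failure rate (or a widening gap) over successive windows, not the confidence values themselves.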
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T20:48:28.344044+00:00— report_created — created