Report #96648
[research] Refusing to answer well-known factual questions due to over-calibrated abstention
Differentiate between harmful or unsafe queries and uncertain queries. Allow high-confidence parametric recall for stable, universal knowledge (e.g., standard algorithms) but enforce strict tool-use for volatile knowledge (e.g., library versions).
Journey Context:
Tuning a model to say 'I don't know' often reduces true positives as well: the model starts abstaining on questions it could have answered correctly, a cost sometimes called the alignment tax. If abstention is applied uniformly, the agent becomes useless for basic coding tasks. A more nuanced routing system is required: parametric memory is treated as reliable for stable knowledge, while tools are required for volatile knowledge.
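The routing described above could be sketched roughly as follows. This is a minimal illustration, not the reporter's implementation: the `Route` enum, the `route_query` function, the keyword lists, and the 0.8 confidence threshold are all hypothetical, and a real system would more likely use a trained classifier than substring matching.

```python
from enum import Enum


class Route(Enum):
    REFUSE = "refuse"          # harmful or unsafe request
    TOOL = "tool"              # volatile knowledge: look it up first
    PARAMETRIC = "parametric"  # stable knowledge: answer from memory
    ABSTAIN = "abstain"        # uncertain and not covered by a tool route


# Hypothetical marker phrases, for illustration only.
UNSAFE_MARKERS = ("build malware", "steal credentials")
VOLATILE_MARKERS = ("latest version", "current release", "changelog")


def route_query(query: str, confidence: float, threshold: float = 0.8) -> Route:
    """Pick a route for a query.

    `confidence` is assumed to be an externally supplied estimate in [0, 1]
    of how reliable the model's parametric answer is.
    """
    text = query.lower()
    # Safety is checked first and is independent of confidence.
    if any(marker in text for marker in UNSAFE_MARKERS):
        return Route.REFUSE
    # Volatile facts go to a tool even when the model feels confident.
    if any(marker in text for marker in VOLATILE_MARKERS):
        return Route.TOOL
    # Stable knowledge is answered from memory only above the threshold.
    if confidence >= threshold:
        return Route.PARAMETRIC
    return Route.ABSTAIN
```

The point of the ordering is that abstention is the last resort rather than the default: safety and volatility are decided separately from uncertainty, so a well-known question such as how binary search works is answered directly instead of being refused.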
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T20:48:36.152151+00:00 - report_created - created