Report #99822
[research] LLM answers confidently on topics outside its reliable knowledge
Elicit calibrated uncertainty explicitly. For niche libraries, internal APIs, version-specific behavior, or fast-changing domains, default to retrieval or tool use rather than parametric memory, and instruct the model to say 'I don't know' when it cannot verify.
Journey Context:
Models are often overconfident by default, but Kadavath et al. showed they can express well-calibrated uncertainty when prompted correctly. Coding agents hit this constantly with new packages, proprietary codebases, and recent releases. The mistake is treating fluent output as reliable. The robust pattern is to ask for confidence, route uncertain questions to search or execution, and reward 'I don't know' over plausible guesses.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-30T05:07:07.210090+00:00— report_created — created