Report #48010
[architecture] Agents silently failing with low-confidence outputs instead of escalating
Implement calibrated confidence scoring with hard thresholds. Route any output below 0.85 confidence (or a domain-specific calibrated threshold) to a human-in-the-loop reviewer or a specialized expert agent, never to a downstream generalist agent.
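A minimal sketch of the routing rule, assuming the confidence value has already been calibrated. The names `Route`, `AgentOutput`, `DOMAIN_THRESHOLDS`, `EXPERT_DOMAINS`, and `route` are hypothetical, and the per-domain thresholds are illustrative values, not measured ones; only the 0.85 default comes from this report.

```python
from dataclasses import dataclass
from enum import Enum


class Route(Enum):
    DOWNSTREAM = "downstream"        # confident enough to pass along
    EXPERT_AGENT = "expert_agent"    # specialized agent for this domain
    HUMAN_REVIEW = "human_review"    # human-in-the-loop fallback


DEFAULT_THRESHOLD = 0.85  # default from the report

# Hypothetical per-domain overrides; real values should come from calibration data.
DOMAIN_THRESHOLDS = {"medical": 0.95, "billing": 0.90}

# Hypothetical set of domains that have a specialized expert agent available.
EXPERT_DOMAINS = {"billing"}


@dataclass(frozen=True)
class AgentOutput:
    text: str
    domain: str
    calibrated_confidence: float  # must already be calibrated, in [0, 1]


def route(output: AgentOutput) -> Route:
    """Pick a destination; low-confidence outputs never reach a generalist."""
    confidence = output.calibrated_confidence
    # Reject out-of-range values (this also catches NaN) instead of guessing.
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence out of range: {confidence!r}")

    threshold = DOMAIN_THRESHOLDS.get(output.domain, DEFAULT_THRESHOLD)
    if confidence >= threshold:
        return Route.DOWNSTREAM
    if output.domain in EXPERT_DOMAINS:
        return Route.EXPERT_AGENT
    return Route.HUMAN_REVIEW
```

Raising on an invalid confidence, rather than defaulting to a route, keeps a broken scorer from failing silently, which is the failure mode this report is about.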
Journey Context:
LLMs do not naturally produce well-calibrated confidence. An agent that says 'I'm 90% sure' might be right only 60% of the time. Without explicit calibration (for example, Platt scaling or isotonic regression fitted on validation data), thresholds are meaningless. Common mistake: using softmax probabilities from LLM logits, which are poorly calibrated for open-ended generation. The fix requires a separate confidence model or a human feedback loop to train the calibration. Alternatives: always escalate (expensive) or never escalate (dangerous). Calibrated thresholds trade off cost against accuracy.
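To make the calibration step concrete, here is a small pure-Python sketch of isotonic regression using the pool-adjacent-violators algorithm. It maps raw stated confidence to observed accuracy as a step function (library implementations such as scikit-learn's typically interpolate between points instead). The function names and the validation data are hypothetical and chosen to mirror the "90% sure, right 60% of the time" case above.

```python
from bisect import bisect_left


def fit_isotonic(raw_scores, outcomes):
    """Fit a non-decreasing step map from raw confidence to observed accuracy.

    outcomes are 1 (agent was correct) or 0 (agent was wrong).
    Returns (uppers, means): each block's highest raw score and its accuracy.
    """
    # Group tied raw scores first so each distinct score gets one value.
    totals = {}
    for score, outcome in zip(raw_scores, outcomes):
        correct, count = totals.get(score, (0.0, 0))
        totals[score] = (correct + outcome, count + 1)

    blocks = []  # each block is [correct_sum, count, highest_raw_score]
    for score in sorted(totals):
        correct, count = totals[score]
        blocks.append([correct, count, score])
        # Pool adjacent violators: merge while accuracy decreases left to right.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            correct, count, upper = blocks.pop()
            blocks[-1][0] += correct
            blocks[-1][1] += count
            blocks[-1][2] = upper

    uppers = [block[2] for block in blocks]
    means = [block[0] / block[1] for block in blocks]
    return uppers, means


def calibrate(model, raw_score):
    """Look up the calibrated confidence for a raw score."""
    uppers, means = model
    index = bisect_left(uppers, raw_score)  # first block covering raw_score
    return means[min(index, len(means) - 1)]


# Hypothetical validation set: stated confidence and whether the agent was right.
VALIDATION_SCORES = [0.7] * 4 + [0.9] * 5 + [0.99] * 4
VALIDATION_OUTCOMES = [1, 0, 0, 0] + [1, 1, 1, 0, 0] + [1, 1, 1, 1]

MODEL = fit_isotonic(VALIDATION_SCORES, VALIDATION_OUTCOMES)


def needs_escalation(raw_score, threshold=0.85):
    """Apply the hard threshold to the calibrated value, not the raw one."""
    return calibrate(MODEL, raw_score) < threshold
```

With this made-up data, a stated confidence of 0.9 calibrates to about 0.6 and is escalated, even though the raw value clears the 0.85 threshold. A real validation set would need far more samples per score range for the mapping to be trustworthy.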
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T11:03:58.978717+00:00— report_created — created