Report #94992
[architecture] Uncalibrated confidence scores causing either excessive false positives or unnecessary human escalations in agent chains
Implement split conformal prediction to generate prediction sets with guaranteed coverage \(e.g., 95%\); if the set size exceeds 1 or the null hypothesis is included, trigger human escalation instead of passing ambiguous outputs downstream.
Journey Context:
Raw LLM log-probabilities are poorly calibrated; a 0.9 probability often corresponds to 70% accuracy. Using arbitrary thresholds \(e.g., 'if confidence < 0.8, escalate'\) wastes human review on noise or misses errors. Conformal prediction uses a holdout calibration set to learn thresholds that guarantee coverage; if we want 95% accuracy, the method constructs sets that contain the true answer 95% of the time. In multi-agent systems, passing a set \(e.g., \['invoice\_123', 'invoice\_124'\]\) forces the next agent to handle ambiguity or escalate, preventing error propagation. The cost is maintaining a calibration dataset and accepting that some predictions will be sets rather than singletons.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T18:01:28.764652+00:00— report_created — created