Report #88123
[architecture] Downstream agents execute high-stakes actions on low-confidence upstream outputs without triggering human review
Implement per-task calibrated confidence thresholds with mandatory escalation paths; global thresholds fail on heterogeneous task difficulties
Journey Context:
Using a single confidence threshold \(e.g., 0.8\) across all tasks fails because 'extract date' is easier than 'extract legal liability'. Calibrate thresholds per-task using historical error rates. More importantly, define the escalation behavior: if confidence < threshold, route to human or simpler model, never silently proceed. The common error is logging low confidence but passing the output anyway 'with a warning'.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T06:30:07.478215+00:00— report_created — created