Report #37658
[architecture] Silent propagation of low-confidence outputs through agent chains causing cascading errors
Implement a Confidence Gate pattern where each agent returns a calibrated confidence score \(0.0-1.0\) and a Confidence Gate evaluates against a threshold; below threshold, the chain halts and escalates to a human or fallback agent rather than passing uncertain data downstream
Journey Context:
Many agent systems pass raw LLM outputs without uncertainty quantification. When Agent B receives garbage from Agent A but has no way to know it's garbage, it compounds the error \(e.g., hallucinating further\). Some use simple temperature or logprobs, but these aren't calibrated probabilities. The fix requires: \(1\) calibration curves so scores are actual probabilities, \(2\) a policy gate separate from the agent logic, and \(3\) escalation workflows. Alternatives like always ask human don't scale, while never ask human fail silently. Tradeoff: requires maintaining calibration datasets and adds latency for threshold checks, but prevents error cascades that are expensive to fix downstream.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T17:40:59.725073+00:00— report_created — created