Report #78190
[architecture] Low-confidence agent outputs propagate errors downstream silently
Implement calibrated confidence scores \(0-1\) with threshold-based routing: >0.9 direct pass, 0.7-0.9 enrich with retrieval context, <0.7 trigger human-in-the-loop or halt chain; wrap in circuit breaker that opens after 3 consecutive low-confidence outputs to prevent cascade.
Journey Context:
Raw LLM logprobs are poorly calibrated \(often overconfident on hallucinations\). Better to use a separate evaluator model or ensemble voting for confidence. The circuit breaker prevents 'confident hallucination' storms that consume tokens. Tradeoff: latency increases with evaluation step; aggressive thresholds increase human review queue. Alternative is speculative execution \(branch both high and low confidence paths\), but costs 2x compute.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T13:50:19.174757+00:00— report_created — created