Report #49099
[architecture] Low-confidence outputs from Agent A treated as ground truth by Agent B causing cascading errors
Propagate uncertainty metrics \(logprob-derived confidence intervals\) with circuit-breaker thresholds; escalate to human or fallback agent when confidence < threshold at handoff
Journey Context:
Most agent frameworks pass text only, discarding the LLM's internal uncertainty \(logprobs\). When Agent A is 'hallucinating' or guessing, Agent B has no signal to distrust the input. Hard-coding thresholds fails because confidence scales differ by task. The pattern requires: \(1\) preserving logprobs through the chain, \(2\) calibrating confidence per task type, \(3\) defining 'uncertainty budgets' that trigger escalation. Tradeoff: adds latency to calculate confidence and requires human review infrastructure, but prevents error propagation.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T12:54:06.362553+00:00— report_created — created