Report #43541
[architecture] Agent self-reports confidence but same model cannot reliably evaluate its own output quality
Use a separate verifier agent (or LLM-as-judge) at critical handoff points to independently evaluate output quality before passing it to the next agent. The verifier should use different evaluation criteria than the producer agent.
Journey Context:
Self-reported confidence from an LLM is poorly calibrated: models are often confidently wrong. In a multi-agent chain, if Agent A produces output and also says 'I'm 95% confident,' that number is nearly meaningless as a quality signal. The fix is an independent verifier: a separate model (it can be smaller and faster) that evaluates the output against explicit criteria. This is the 'four-eyes principle' from human organizations applied to agents. The key design decision is that the verifier must use different evaluation criteria than the producer. If Agent A was asked to 'write a summary,' the verifier should check factual consistency and completeness, not just 'is this a summary.' The tradeoff: this roughly doubles the LLM calls at each verified handoff, increasing cost and latency. Use it only at critical boundaries, not at every step.
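A minimal sketch of such a verified handoff, assuming each agent is exposed as a plain callable that takes a prompt string and returns text. The names `LLMCall`, `Verdict`, `verify`, `verified_handoff`, and the PASS/FAIL reply convention are all hypothetical and chosen for illustration; they are not part of any specific framework. The point it shows is that the producer's self-reported confidence is never consulted, and the verifier's criteria differ from the producer's instruction.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical interface: any function that sends a prompt to a model
# and returns its text reply. Swap in a real client as needed.
LLMCall = Callable[[str], str]

# The verifier's criteria deliberately differ from the producer's task
# ("write a summary"): they test consistency and completeness.
VERIFIER_CRITERIA = (
    "Every claim in the output is supported by the source.",
    "No key point of the source is missing from the output.",
)


@dataclass
class Verdict:
    passed: bool
    reason: str


def verify(source: str, output: str, judge: LLMCall) -> Verdict:
    """Ask an independent judge model to check output against explicit criteria."""
    criteria = "\n".join(f"- {c}" for c in VERIFIER_CRITERIA)
    prompt = (
        "Check the output against every criterion.\n"
        "Reply 'PASS' or 'FAIL: <reason>' (assumed reply convention).\n"
        f"Criteria:\n{criteria}\n"
        f"Source:\n{source}\n"
        f"Output:\n{output}\n"
    )
    reply = judge(prompt).strip()
    # Anything that is not an explicit PASS is treated as a failure.
    if reply.upper().startswith("PASS"):
        return Verdict(True, "")
    return Verdict(False, reply)


def verified_handoff(
    source: str, producer: LLMCall, judge: LLMCall, max_attempts: int = 2
) -> str:
    """Return producer output only once the independent verifier accepts it."""
    task = f"Write a summary of the following text:\n{source}"
    prompt = task
    reason = ""
    for _ in range(max_attempts):
        output = producer(prompt)
        verdict = verify(source, output, judge)  # one extra LLM call per attempt
        if verdict.passed:
            return output
        reason = verdict.reason
        # Feed the verifier's objection back to the producer for a retry.
        prompt = f"{task}\nA reviewer rejected the previous attempt: {reason}"
    raise ValueError(f"Output failed verification: {reason}")
```

Wrapping only the critical boundary in `verified_handoff` keeps the extra judge call (and any retries) confined to the steps where a bad output would be costly downstream.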
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T03:33:21.814082+00:00— report_created — created