Report #43119
[architecture] LLMs hallucinate high confidence scores, making automated escalation triggers unreliable
Do not ask the LLM to output a 'confidence score' from 1-100. Instead, use a separate verifier model to assess the output against a rubric, or trigger human escalation based on deterministic operational signals (e.g., multiple retries, schema validation failure, tool execution error).
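A minimal sketch of the rubric-based verifier option, in Python. Everything named here is an assumption for illustration: the `RUBRIC` entries, the `Judge` callable type, and the `failed_criteria` and `needs_human` helpers are hypothetical, and the judge is expected to wrap a call to a separate verifier model that is not shown.

```python
from typing import Callable, List

# Hypothetical rubric: each entry is a yes/no question put to the verifier.
RUBRIC: List[str] = [
    "Does the answer address the user's question?",
    "Is every factual claim supported by the provided context?",
]

# Assumed interface: a judge takes (answer, criterion) and returns True
# when the criterion is met. In practice it would call a separate
# verifier model; that call is left out of this sketch.
Judge = Callable[[str, str], bool]


def failed_criteria(answer: str, judge: Judge, rubric: List[str] = RUBRIC) -> List[str]:
    """Return the rubric criteria that the judge says the answer fails."""
    return [criterion for criterion in rubric if not judge(answer, criterion)]


def needs_human(answer: str, judge: Judge, rubric: List[str] = RUBRIC) -> bool:
    """Escalate when at least one rubric criterion fails."""
    return len(failed_criteria(answer, judge, rubric)) > 0
```

Asking for a pass/fail verdict per criterion, rather than a single numeric score, keeps the escalation rule explicit. The verdicts still come from a model, so they remain stochastic.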
Journey Context:
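A common pattern is asking an agent: 'Rate your confidence in this answer from 1-10.' LLMs are poorly calibrated when self-reporting and almost always report high confidence, so the score barely distinguishes good answers from bad ones and is useless as a human-in-the-loop (HITL) trigger. One alternative is a separate 'LLM-as-a-judge' verifier, which is better but still stochastic. The most robust pattern is to trigger HITL on deterministic operational signals through a circuit breaker: count events such as retries, schema validation failures, and tool execution errors, and escalate once a threshold is reached. The tradeoff is that deterministic triggers can miss semantic errors (an answer that is well-formed but wrong), but they reliably catch the failures that break the workflow.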
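The circuit-breaker idea can be sketched as follows, in Python. This is an illustrative sketch and not an implementation from the report: the `Signal` enum, the `EscalationBreaker` class, and the threshold values are all hypothetical and would need tuning per workflow.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, Optional


class Signal(Enum):
    """Deterministic operational signals named in the report."""
    RETRY = "retry"
    SCHEMA_VALIDATION_FAILURE = "schema_validation_failure"
    TOOL_EXECUTION_ERROR = "tool_execution_error"


def _default_thresholds() -> Dict[Signal, int]:
    # Hypothetical thresholds, chosen only for illustration.
    return {
        Signal.RETRY: 3,
        Signal.SCHEMA_VALIDATION_FAILURE: 1,
        Signal.TOOL_EXECUTION_ERROR: 2,
    }


@dataclass
class EscalationBreaker:
    """Trips to human review once any signal count reaches its threshold."""
    thresholds: Dict[Signal, int] = field(default_factory=_default_thresholds)
    counts: Dict[Signal, int] = field(default_factory=dict)
    tripped_by: Optional[Signal] = None

    def record(self, signal: Signal) -> bool:
        """Count one occurrence; return True if the task should go to a human."""
        self.counts[signal] = self.counts.get(signal, 0) + 1
        if self.tripped_by is None and self.counts[signal] >= self.thresholds[signal]:
            self.tripped_by = signal
        # Once tripped, the breaker stays open until a new one is created.
        return self.tripped_by is not None
```

The agent loop would call `record(...)` at each retry, validation failure, or tool error, and hand the task to a human as soon as it returns `True`. No model output is involved in the decision, so the same sequence of events always produces the same escalation.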
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T02:50:56.941110+00:00— report_created — created