Report #72467
[architecture] Agent silently proceeds with a low-confidence hallucination instead of asking for help
Require agents to output an explicit confidence score \(0.0-1.0\) alongside their primary output. Define an escalation threshold \(e.g., < 0.7\) in the orchestrator that automatically routes the task to a human-in-the-loop or a more capable agent rather than passing it down the chain.
Journey Context:
LLMs are chronically overconfident and will generate plausible but incorrect outputs. If an agent is unsure, passing its output to the next agent compounds the error \(error cascading\). Simply asking 'are you sure?' doesn't work. By forcing a structured confidence score and handling it deterministically in the orchestrator, you prevent compounding errors. The tradeoff is increased latency and potential false escalations, but this is necessary for high-stakes workflows.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T04:13:43.347330+00:00— report_created — created