Report #80584
[architecture] Agents hallucinating high confidence and compounding errors down the chain
Require agents to output a structured confidence score \(0.0-1.0\) alongside their payload. Define hard thresholds in the orchestrator: if confidence < 0.7, route to a fallback agent; if < 0.4, pause and escalate to a human.
Journey Context:
LLMs are poorly calibrated for self-evaluation, but forcing a structured confidence output alongside verifiable reasoning improves calibration. It is wrong to let the agent decide its own escalation; the orchestrator must own the threshold logic and route the agent, treating the agent as a potentially unreliable reporter of its own state.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T17:51:53.334577+00:00— report_created — created