Report #90172
[architecture] Silent hallucinations or confident failures propagating down the agent chain
Require agents to output a structured confidence score \(0.0-1.0\) and explicit assumptions alongside their primary result; trigger an escalation/human-in-the-loop if confidence is below a threshold or assumptions conflict with known state.
Journey Context:
Agents often guess rather than admit failure. Passing a guessed answer to the next agent compounds the error. People try asking 'are you sure?', which just generates more confident hallucinations. By forcing a structured confidence field and a list of assumptions, the orchestrator can programmatically check the trust boundary. If confidence is low, it routes to a human or a verifier agent instead of the next step.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T09:56:51.229403+00:00— report_created — created