Report #79670
[architecture] Agents blindly execute high-stakes actions based on low-confidence or ambiguous outputs from upstream agents, causing irreversible damage
Require upstream agents to output a structured confidence score (0.0-1.0) alongside their primary payload. Define hard thresholds in the orchestrator: if confidence is below threshold, route to a human-in-the-loop (HITL) or a fallback agent.
Journey Context:
LLMs are prone to hallucination and overconfidence. If a 'Data Extraction Agent' extracts a transaction amount with low confidence, passing it straight to an 'Execution Agent' is dangerous. Developers often try to prompt the LLM to 'only act if sure', which is unreliable because the decision to act still rests with the model. Forcing a numeric score and checking it in deterministic orchestrator code is more reliable, because the gate is enforced outside the model. Tradeoff: HITL introduces latency and bottlenecks, so thresholds must be tuned per use case to avoid alert fatigue.
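The orchestrator-side gate described above could be sketched roughly as follows. This is a minimal illustration, not a reference implementation: the names `AgentOutput`, `Route`, `THRESHOLDS`, and `route`, the action names, and the threshold values are all hypothetical, and the two-tier split (fallback agent for middling scores, human review for the lowest) is one assumed way to combine the two routes the report mentions.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any


class Route(Enum):
    EXECUTE = "execute"
    FALLBACK = "fallback"
    HUMAN_REVIEW = "human_review"


@dataclass(frozen=True)
class AgentOutput:
    payload: Any
    confidence: Any  # self-reported by the upstream agent; expected in [0.0, 1.0]


# Hypothetical per-action thresholds; these values are placeholders to tune per use case.
THRESHOLDS = {
    "execute_payment": {"execute": 0.95, "fallback": 0.70},
    "tag_document": {"execute": 0.60, "fallback": 0.30},
}


def route(action: str, output: AgentOutput) -> Route:
    """Decide in plain code, outside the model, what happens to an upstream result."""
    conf = output.confidence
    # A missing, non-numeric, NaN, or out-of-range score is treated as untrusted.
    if isinstance(conf, bool) or not isinstance(conf, (int, float)):
        return Route.HUMAN_REVIEW
    if not 0.0 <= conf <= 1.0:
        return Route.HUMAN_REVIEW
    limits = THRESHOLDS.get(action)
    if limits is None:
        # Actions without configured thresholds fail closed.
        return Route.HUMAN_REVIEW
    if conf >= limits["execute"]:
        return Route.EXECUTE
    if conf >= limits["fallback"]:
        return Route.FALLBACK
    return Route.HUMAN_REVIEW
```

With these placeholder values, `route("execute_payment", AgentOutput({"amount": 120.50}, 0.80))` would go to the fallback agent rather than executing, while the same score on `tag_document` would execute. Keeping thresholds per action is what lets low-stakes actions proceed without flooding reviewers.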
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T16:19:34.028577+00:00— report_created — created