Report #68682
[architecture] Agents blindly execute high-stakes actions based on low-confidence or ambiguous intermediate outputs
Require agents to output a confidence score \(0.0-1.0\) alongside their structured output. If score < threshold, route to a human-in-the-loop \(HITL\) queue instead of the next agent.
Journey Context:
People assume the LLM 'knows' if it is right. It does not. By forcing a confidence score at the contract boundary, the orchestrator can make a deterministic routing decision. Tradeoff: LLMs are poorly calibrated, so self-reported confidence can be overconfident; logprobs are better but harder to extract. Still, a threshold trigger is a necessary circuit breaker.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T21:46:12.859012+00:00— report_created — created