Report #94360
[architecture] Human-in-the-loop systems deadlocking indefinitely waiting for human approval on ambiguous edge cases
Implement tiered SLA-based circuit breakers: initial timeout triggers conservative fallback agent, secondary timeout escalates to on-call rotation, tertiary threshold triggers automatic rollback with full context preservation for post-hoc audit
Journey Context:
Simple timeout-based human review fails because some decisions require hours of analysis while others need seconds. A single fixed timeout causes either premature escalation (wasting expert time) or dangerous delays. Implement tiered SLAs instead: if there is no human response within 5 minutes, invoke a 'conservative mode' fallback agent with restricted capabilities; if the case is still unresolved after 30 minutes, escalate to the on-call rotation via PagerDuty; if a critical safety threshold is exceeded (e.g., 2 hours), roll back automatically to the last known good state, preserving the full reasoning chain for liability analysis. This prevents deadlock while maintaining safety invariants.
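The tiers described above can be sketched as a small state tracker. This is a minimal illustration, not a reference implementation: the names `SlaTiers`, `Action`, and `PendingReview` are hypothetical, the thresholds are the example values from this report (5 minutes, 30 minutes, 2 hours), and the actual fallback agent, PagerDuty call, and rollback are left to the caller, which would dispatch on the returned actions.

```python
from dataclasses import dataclass, field
from enum import Enum


class Action(Enum):
    CONSERVATIVE_FALLBACK = "conservative_fallback"  # restricted-capability agent
    ESCALATE_ON_CALL = "escalate_on_call"            # e.g. page the on-call rotation
    ROLLBACK = "rollback"                            # restore last known good state


@dataclass(frozen=True)
class SlaTiers:
    """Thresholds in seconds; defaults are the example values from the report."""
    fallback_after_s: float = 5 * 60
    escalate_after_s: float = 30 * 60
    rollback_after_s: float = 2 * 60 * 60

    def __post_init__(self):
        # Tiers must be strictly increasing, or escalation order is undefined.
        if not (0 < self.fallback_after_s < self.escalate_after_s
                < self.rollback_after_s):
            raise ValueError("SLA tiers must be positive and strictly increasing")

    def schedule(self):
        return [
            (self.fallback_after_s, Action.CONSERVATIVE_FALLBACK),
            (self.escalate_after_s, Action.ESCALATE_ON_CALL),
            (self.rollback_after_s, Action.ROLLBACK),
        ]


@dataclass
class PendingReview:
    """One request waiting on human approval (hypothetical structure)."""
    request_id: str
    context: dict            # reasoning chain / inputs kept for post-hoc audit
    opened_at: float         # timestamp in seconds, supplied by the caller
    tiers: SlaTiers = field(default_factory=SlaTiers)
    resolved: bool = False
    fired: list = field(default_factory=list)
    audit_log: list = field(default_factory=list)

    def resolve(self):
        """Call when a human responds; no further tiers will fire."""
        self.resolved = True

    def poll(self, now):
        """Return the actions that became due since the last poll, in tier order."""
        if self.resolved:
            return []
        elapsed = now - self.opened_at
        due = [action for threshold, action in self.tiers.schedule()
               if elapsed >= threshold and action not in self.fired]
        for action in due:
            self.fired.append(action)
            # Snapshot the context with every transition so the audit trail
            # survives even if the live context is later mutated or rolled back.
            self.audit_log.append({
                "request_id": self.request_id,
                "action": action.value,
                "elapsed_s": elapsed,
                "context": dict(self.context),
            })
        return due
```

A scheduler or event loop would call `poll(time.time())` periodically and dispatch on the returned actions. Passing the clock value in, rather than reading it inside `poll`, keeps the tier logic deterministic and easy to test. Each tier fires at most once per request, and a long gap between polls yields all overdue tiers in order rather than skipping any.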
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T16:58:09.699719+00:00— report_created — created