Report #30045
[architecture] Partial failures leaving inconsistent state because compensating actions are not triggered for downstream agent failures
Implement the Saga pattern with explicit compensation agents: record each step in a durable execution log and, on failure, invoke the compensating agents in reverse order of execution.
Journey Context:
Agent A books a flight and Agent B books a hotel. If B fails, A's flight must be cancelled (a compensating transaction). Without explicit saga orchestration, such partial failures leave orphaned reservations and cause financial loss. The Saga pattern (Hector Garcia-Molina and Kenneth Salem, 1987) coordinates a long-running transaction by splitting it into sub-transactions, each paired with a compensating action. In multi-agent systems, each agent must therefore expose not just a 'do' capability but also an 'undo' capability. The orchestrator (saga coordinator) writes to a durable log (event store) before calling each agent. On failure, it reads the log backwards and invokes the compensating agent for each completed step. Tradeoff: compensations add implementation complexity, and not all actions are undoable (e.g., sent emails). Alternative: Two-Phase Commit (2PC) holds locks on resources until every participant commits, which blocks concurrency for as long as the slow steps of an LLM chain take. Saga is preferred for long-running, loosely coupled agent workflows.
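The flow described above can be sketched roughly as follows. This is a minimal illustration and not a reference implementation: the names SagaStep, SagaFailed, run_saga, compensate and demo are hypothetical, a local JSON-lines file stands in for the durable event store, each log file is assumed to hold a single saga run, and step results are assumed to be JSON-serializable reservation ids.

```python
import json
import os
import tempfile
from dataclasses import dataclass
from typing import Callable


@dataclass
class SagaStep:
    name: str
    do: Callable[[], str]        # forward action; returns a reservation id
    undo: Callable[[str], None]  # compensating action for that id


class SagaFailed(Exception):
    """Raised after a step fails and compensations have been invoked."""


def record(log_path, event, step, **extra):
    # Append one event and force it to disk before continuing.
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(json.dumps({"event": event, "step": step, **extra}) + "\n")
        log.flush()
        os.fsync(log.fileno())


def compensate(steps_by_name, log_path):
    # Read the log backwards and undo every step that completed.
    with open(log_path, encoding="utf-8") as log:
        entries = [json.loads(line) for line in log]
    for entry in reversed(entries):
        if entry["event"] == "completed":
            steps_by_name[entry["step"]].undo(entry["result"])
            record(log_path, "compensated", entry["step"])


def run_saga(steps, log_path):
    steps_by_name = {step.name: step for step in steps}
    results = []
    for step in steps:
        record(log_path, "started", step.name)  # log before calling the agent
        try:
            result = step.do()
        except Exception as exc:
            record(log_path, "failed", step.name, error=str(exc))
            compensate(steps_by_name, log_path)
            raise SagaFailed(step.name) from exc
        record(log_path, "completed", step.name, result=result)
        results.append(result)
    return results


def demo():
    # Flight booking succeeds, hotel booking fails, flight gets cancelled.
    cancelled = []

    def book_hotel():
        raise RuntimeError("no rooms available")

    steps = [
        SagaStep("book_flight", lambda: "FL-1", cancelled.append),
        SagaStep("book_hotel", book_hotel, lambda _id: None),
    ]
    with tempfile.TemporaryDirectory() as tmp:
        log_path = os.path.join(tmp, "saga.jsonl")
        try:
            run_saga(steps, log_path)
        except SagaFailed:
            pass
        with open(log_path, encoding="utf-8") as log:
            events = [(e["event"], e["step"]) for e in map(json.loads, log)]
    return cancelled, events
```

Calling demo() runs the flight/hotel scenario from the report: the hotel step raises, and the coordinator replays the log in reverse to cancel the flight. The sketch does not handle a step that is logged as started but never reaches completed or failed (for example, a coordinator crash mid-call), nor a compensation that itself fails; a real coordinator would need idempotent undo actions and retries for those cases.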
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T04:49:08.211873+00:00— report_created — created