Report #5415
[architecture] Cron job running multiple times or not at all in distributed environments
Replace distributed cron with a work queue (SQS/RabbitMQ) for processing. Use cron only to generate idempotent 'tick' messages that enqueue work items, not to mutate state.
Journey Context:
Cron assumes a single machine with a reliable clock. Distributed systems have clock skew and multiple instances, so the same schedule can fire on several machines or on none. Leader election for cron is complex (split-brain risk). If a job's duration exceeds its interval, cron starts overlapping processes, which causes race conditions. Queues naturally handle backpressure, retries, and horizontal scaling, and, when combined with idempotency keys, prevent duplicate execution. The 'tick' pattern uses cron only to enqueue a 'wake up' message, which workers then process idempotently.
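A minimal sketch of the 'tick' pattern, assuming an in-memory queue.Queue as a stand-in for SQS/RabbitMQ and a Python set as a stand-in for a durable idempotency store. The names tick_key, enqueue_tick, and Worker are hypothetical and not taken from any library. The key is derived from the nearest schedule boundary, so ticks from instances whose clocks differ by less than half the interval collapse to the same key.

```python
import queue
from datetime import datetime, timezone


def tick_key(job_name: str, now: datetime, interval_seconds: int) -> str:
    """Build a deterministic idempotency key for the schedule slot nearest to `now`.

    Rounding to the nearest slot (rather than truncating) tolerates clocks that
    run slightly ahead of or behind the schedule boundary.
    """
    slot = (int(now.timestamp()) + interval_seconds // 2) // interval_seconds
    return f"{job_name}:{slot}"


def enqueue_tick(q: queue.Queue, job_name: str, now: datetime, interval_seconds: int) -> None:
    """What the cron entry does: enqueue a 'wake up' message and nothing else."""
    q.put({"job": job_name, "key": tick_key(job_name, now, interval_seconds)})


class Worker:
    """Processes tick messages at most once per idempotency key."""

    def __init__(self) -> None:
        # Assumption: a real deployment would use a shared durable store and
        # claim the key atomically (for example, an insert guarded by a unique
        # constraint). This check-then-add is only safe in a single thread.
        self.processed: set[str] = set()
        self.runs: list[str] = []

    def handle(self, message: dict) -> bool:
        """Return True if the work ran, False if the message was a duplicate."""
        key = message["key"]
        if key in self.processed:
            return False
        self.processed.add(key)
        self.runs.append(key)  # placeholder for the actual state mutation
        return True


if __name__ == "__main__":
    q: queue.Queue = queue.Queue()
    # Two cron instances fire for the same minute with skewed clocks,
    # then one instance fires for the next minute.
    enqueue_tick(q, "report", datetime(2026, 1, 1, 0, 0, 1, tzinfo=timezone.utc), 60)
    enqueue_tick(q, "report", datetime(2025, 12, 31, 23, 59, 58, tzinfo=timezone.utc), 60)
    enqueue_tick(q, "report", datetime(2026, 1, 1, 0, 1, 2, tzinfo=timezone.utc), 60)
    worker = Worker()
    while not q.empty():
        print(worker.handle(q.get()))
```

In this sketch the duplicate tick is dropped by the worker rather than by cron, so it does not matter how many instances run the schedule. If no instance fires, that slot is simply missed, so detecting missed ticks would need a separate mechanism.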
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-15T21:14:57.593597+00:00— report_created — created