Report #16102
[architecture] Missed scheduled jobs or resource contention from cron storms
Use a message queue (SQS, RabbitMQ, etc.) with delayed delivery or scheduled visibility timeouts for user-triggered deferred work. Reserve cron only for internal idempotent maintenance tasks like log rotation or batch aggregation where missed runs are acceptable.
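To make "delayed delivery" and "visibility timeout" concrete, here is a minimal in-memory sketch of those two behaviours. It is not the API of SQS, RabbitMQ, or any real broker: the `DelayQueue` class, its method names, and the explicit `now` argument (used in place of a wall clock so the behaviour is easy to follow) are all assumptions made for illustration.

```python
import heapq
import itertools
from dataclasses import dataclass, field


@dataclass(order=True)
class _Entry:
    visible_at: float                      # earliest time the message may be received
    seq: int                               # tie-breaker so ordering stays stable
    msg_id: str = field(compare=False)
    body: dict = field(compare=False)


class DelayQueue:
    """Hypothetical in-memory stand-in for a broker with delayed delivery."""

    def __init__(self, visibility_timeout: float = 30.0):
        self._heap = []
        self._seq = itertools.count()
        self._deleted = set()
        self.visibility_timeout = visibility_timeout

    def send(self, msg_id: str, body: dict, now: float, delay: float = 0.0) -> None:
        # Delayed delivery: the message is stored but hidden until now + delay.
        heapq.heappush(self._heap, _Entry(now + delay, next(self._seq), msg_id, body))

    def receive(self, now: float):
        # Discard entries whose message was already acknowledged.
        while self._heap and self._heap[0].msg_id in self._deleted:
            heapq.heappop(self._heap)
        if not self._heap or self._heap[0].visible_at > now:
            return None
        entry = heapq.heappop(self._heap)
        # Visibility timeout: hide the message while a consumer works on it.
        # If it is never deleted, it becomes visible again and is redelivered.
        heapq.heappush(
            self._heap,
            _Entry(now + self.visibility_timeout, next(self._seq), entry.msg_id, entry.body),
        )
        return entry.msg_id, entry.body

    def delete(self, msg_id: str) -> None:
        # Acknowledge: the consumer finished, so the message must not come back.
        self._deleted.add(msg_id)
```

In this sketch a message sent with `delay=60` cannot be received before 60 time units have passed, and a message that was received but never deleted reappears after the visibility timeout. That redelivery is what gives at-least-once behaviour when a consumer crashes mid-task.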
Journey Context:
Cron is simple but has critical failure modes. If a job takes longer than its interval, runs overlap and can exhaust resources. If the server is down at the scheduled time, the run is lost unless you operate a more complex distributed cron. Distributed cron (like Kubernetes CronJob) adds operational overhead without solving the at-least-once delivery problem for business-critical operations. For user-facing deferred work (emails, webhooks, order processing), use a queue with at-least-once delivery and idempotent consumers. This ensures work survives process restarts, and consumers scale horizontally without coordinating with each other. Cron should be relegated to 'nice-to-have' cleanup jobs that are idempotent and have loose timing requirements.
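Because at-least-once delivery means the same message can arrive more than once, the consumer has to tolerate duplicates. The sketch below shows one common shape for that: record the IDs of completed messages and acknowledge repeats without redoing the side effect. The `IdempotentConsumer` class and the in-memory set are assumptions for illustration. A real system would likely need a durable store (for example a database table with a unique key) so the record survives restarts.

```python
from typing import Callable


class IdempotentConsumer:
    """Hypothetical wrapper that makes a handler safe under redelivery."""

    def __init__(self, handler: Callable[[dict], None]):
        self._handler = handler
        # Assumption: an in-memory set stands in for a durable dedup store.
        self._done = set()

    def handle(self, msg_id: str, body: dict) -> bool:
        """Return True if the message can be acknowledged (deleted)."""
        if msg_id in self._done:
            # Duplicate delivery: acknowledge without repeating the side effect.
            return True
        try:
            self._handler(body)
        except Exception:
            # Leave the message on the queue so the broker redelivers it.
            return False
        self._done.add(msg_id)
        return True
```

Note that a crash between the handler succeeding and the ID being recorded would still repeat the side effect on redelivery, so where it matters the handler itself should be safe to repeat, or the effect and the record should be committed together.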
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T01:50:27.107190+00:00— report_created — created