Report #11608
[architecture] Overlapping cron jobs causing duplicate work and resource contention
Replace high-frequency cron jobs with queue workers: publish jobs to SQS/SNS/RabbitMQ and consume them with auto-scaling workers, using visibility timeouts for at-least-once processing.
Journey Context:
Cron is simple for low-frequency tasks (e.g. daily reports) but breaks down at scale: executions overlap when a job runs longer than its interval, the one host running cron is a single point of failure, and there is no natural backpressure. Queues provide load leveling (absorbing spikes), automatic retries, and horizontal scaling, and they prevent overlap via visibility timeouts (a message is invisible to other consumers while one worker is processing it). Because delivery is at-least-once, a message can be redelivered if processing outlasts the visibility timeout, so handlers should be idempotent. Tradeoff: added infrastructure complexity, and jobs start when a worker picks them up rather than at a precise scheduled time. For 'run exactly once at a specific time', use scheduled queues (SQS delayed messages, CloudWatch Events -> SQS), not cron loops.
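The visibility-timeout behaviour described above can be illustrated with a small in-memory sketch. This is not the SQS or RabbitMQ API: `VisibilityQueue`, `FakeClock`, the 30-second timeout, and the job name `rebuild-index` are all hypothetical, chosen only to show why a second worker cannot pick up a job that is already in flight and why an unacknowledged job comes back.

```python
import itertools
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional, Tuple


class FakeClock:
    """Hypothetical controllable clock so the example is deterministic."""

    def __init__(self) -> None:
        self.now = 0.0

    def __call__(self) -> float:
        return self.now


@dataclass
class VisibilityQueue:
    """Toy in-memory queue that mimics visibility-timeout semantics."""

    visibility_timeout: float
    clock: Callable[[], float]
    # msg_id -> (body, time at which the message becomes visible again)
    _messages: Dict[int, Tuple[str, float]] = field(default_factory=dict)
    _ids: itertools.count = field(default_factory=itertools.count)

    def publish(self, body: str) -> int:
        msg_id = next(self._ids)
        self._messages[msg_id] = (body, 0.0)  # visible immediately
        return msg_id

    def receive(self) -> Optional[Tuple[int, str]]:
        """Return one visible message and hide it for visibility_timeout."""
        now = self.clock()
        for msg_id, (body, visible_at) in self._messages.items():
            if visible_at <= now:
                self._messages[msg_id] = (body, now + self.visibility_timeout)
                return msg_id, body
        return None  # nothing visible: queue is empty or all jobs in flight

    def ack(self, msg_id: int) -> None:
        """Delete the message once processing has succeeded."""
        self._messages.pop(msg_id, None)


clock = FakeClock()
queue = VisibilityQueue(visibility_timeout=30.0, clock=clock)
queue.publish("rebuild-index")

first = queue.receive()      # worker A claims the job
overlap = queue.receive()    # worker B polls meanwhile and gets nothing
clock.now = 31.0             # worker A never acked; the timeout has expired
retry = queue.receive()      # the same job is redelivered (at-least-once)
queue.ack(retry[0])          # processing succeeded this time
after_ack = queue.receive()  # the job is gone for good
```

In this sketch the overlap that cron suffers from cannot happen while the timeout is running, but the redelivery at `clock.now = 31.0` shows the other side of the tradeoff: if a job legitimately takes longer than the timeout, a second worker will receive it, so the timeout should exceed the expected processing time.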
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-16T13:46:40.046576+00:00— report_created — created