Report #67605
[architecture] Event sourcing aggregate reconstruction taking too long due to replaying thousands of events
Implement snapshotting that stores the aggregate state periodically rather than on every event. Use a versioning strategy: take a snapshot every N events (e.g., every 100) or based on aggregate age, and store snapshots in a separate table with columns (aggregate_id, version, state_payload). When loading, fetch the latest snapshot, then replay only the events after that version. Prefer 'snapshot on write', triggered by event-count thresholds, over 'snapshot on read', so that hot read paths never trigger extra writes.
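The load-and-replay flow described above can be sketched as follows. This is a minimal in-memory illustration, not a definitive implementation: `Account`, `InMemoryStore`, and `SNAPSHOT_EVERY` are hypothetical names, events are reduced to plain integer amounts, and dictionaries stand in for the event table and the separate snapshot table.

```python
from dataclasses import dataclass, field

SNAPSHOT_EVERY = 100  # assumed threshold N; tune per workload


@dataclass
class Account:
    """Hypothetical aggregate: a balance plus the version of the last applied event."""
    balance: int = 0
    version: int = 0

    def apply(self, amount: int) -> None:
        self.balance += amount
        self.version += 1


@dataclass
class InMemoryStore:
    """Stand-in for an event table and a separate snapshot table."""
    events: dict = field(default_factory=dict)     # aggregate_id -> list of amounts
    snapshots: dict = field(default_factory=dict)  # aggregate_id -> (version, balance)

    def append(self, aggregate_id: str, amount: int) -> None:
        stream = self.events.setdefault(aggregate_id, [])
        stream.append(amount)
        version = len(stream)
        # Snapshot on write: only when the version reaches a multiple of N.
        if version % SNAPSHOT_EVERY == 0:
            account, _ = self.load(aggregate_id)
            self.snapshots[aggregate_id] = (account.version, account.balance)

    def load(self, aggregate_id: str):
        """Return (aggregate, number_of_events_replayed)."""
        # Start from the latest snapshot, or from an empty aggregate at version 0.
        version, balance = self.snapshots.get(aggregate_id, (0, 0))
        account = Account(balance=balance, version=version)
        # Replay only the events recorded after the snapshot version.
        tail = self.events.get(aggregate_id, [])[version:]
        for amount in tail:
            account.apply(amount)
        return account, len(tail)
```

With 250 events appended to one aggregate, a load in this sketch starts from the snapshot at version 200 and replays only the remaining 50 events. Reads never write a snapshot here; only `append` does.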
Journey Context:
Without snapshots, loading a long-lived aggregate (e.g., a 10-year-old bank account) requires replaying thousands of events, causing 100ms+ latency and memory issues. Naive snapshotting on every event roughly doubles write load. The threshold-based approach balances read performance against write overhead. 'Snapshot on read' (updating the snapshot during load if it is stale) seems efficient, but it causes race conditions and write contention on popular aggregates, because many concurrent readers try to write the same snapshot. Keying snapshots by version makes snapshot writes idempotent and allows events older than the snapshot to be archived. Teams often skip snapshotting initially, then face painful migrations later.
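One way to get the idempotency mentioned above is to key the snapshot table on (aggregate_id, version), so writing the same version twice is a no-op. The sketch below assumes SQLite via Python's standard `sqlite3` module and a JSON-encoded `state_payload`; the function names and the in-memory database are illustrative assumptions, not part of the original report.

```python
import json
import sqlite3


def open_snapshot_store() -> sqlite3.Connection:
    """Create an in-memory database holding the separate snapshot table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE snapshots (
            aggregate_id  TEXT    NOT NULL,
            version       INTEGER NOT NULL,
            state_payload TEXT    NOT NULL,
            PRIMARY KEY (aggregate_id, version)
        )
        """
    )
    return conn


def save_snapshot(conn, aggregate_id: str, version: int, state: dict) -> bool:
    """Insert a snapshot; return False if that version was already stored."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO snapshots (aggregate_id, version, state_payload) "
        "VALUES (?, ?, ?)",
        (aggregate_id, version, json.dumps(state)),
    )
    conn.commit()
    # rowcount is 0 when the primary key already existed and the insert was ignored.
    return cur.rowcount == 1


def latest_snapshot(conn, aggregate_id: str):
    """Return (version, state) for the newest snapshot, or None if there is none."""
    row = conn.execute(
        "SELECT version, state_payload FROM snapshots "
        "WHERE aggregate_id = ? ORDER BY version DESC LIMIT 1",
        (aggregate_id,),
    ).fetchone()
    return None if row is None else (row[0], json.loads(row[1]))
```

In this sketch a duplicate write for an existing version leaves the stored payload untouched, so a retried or concurrent snapshot write cannot overwrite the first one. Older snapshot rows are kept; pruning them, and archiving events below the latest snapshot version, would be a separate step.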
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T19:57:19.556949+00:00— report_created — created