Report #28701
[architecture] Unbounded aggregate loading latency and memory exhaustion when rehydrating event-sourced aggregates with long event streams
Implement snapshotting: persist a snapshot of the aggregate state every N events (e.g., every 100 events) or at a fixed time interval. To load an aggregate, read the latest snapshot and apply only the events recorded after it. Version the snapshot schema so that snapshots survive changes to the aggregate.
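A minimal sketch of the snapshot-then-replay loop, assuming an in-memory store and a toy account aggregate. `Account`, `InMemoryStore`, and `SNAPSHOT_EVERY` are hypothetical names chosen for illustration; a real system would persist events and snapshots durably.

```python
from dataclasses import dataclass, field
from typing import Optional

SNAPSHOT_EVERY = 100  # hypothetical N: take a snapshot every 100 events


@dataclass
class Account:
    balance: int = 0
    version: int = 0  # number of events applied so far

    def apply(self, event: dict) -> None:
        if event["type"] == "deposited":
            self.balance += event["amount"]
        elif event["type"] == "withdrawn":
            self.balance -= event["amount"]
        else:
            raise ValueError(f"unknown event type: {event['type']}")
        self.version += 1


@dataclass
class InMemoryStore:
    events: list = field(default_factory=list)
    snapshot: Optional[dict] = None  # only the latest snapshot is kept

    def append(self, account: Account, event: dict) -> None:
        """Record the event, apply it, and snapshot on every N-th event."""
        self.events.append(event)
        account.apply(event)
        if account.version % SNAPSHOT_EVERY == 0:
            self.snapshot = {"version": account.version, "balance": account.balance}

    def load(self) -> tuple:
        """Rebuild the aggregate; returns (account, number_of_events_replayed)."""
        account = Account()
        if self.snapshot is not None:
            account = Account(
                balance=self.snapshot["balance"],
                version=self.snapshot["version"],
            )
        replayed = 0
        # Only events recorded after the snapshot position are replayed.
        for event in self.events[account.version:]:
            account.apply(event)
            replayed += 1
        return account, replayed
```

With 250 events appended, snapshots are taken at positions 100 and 200, so `load()` replays only the last 50 events instead of all 250.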
Journey Context:
Pure event sourcing requires replaying all historical events to reach the current state. For long-lived aggregates (user accounts, inventory items, bank accounts), each load costs O(n) in the number of events, which becomes unacceptable at scale (100k+ events) and causes high latency and memory pressure during command handling. Snapshots are denormalized state caches stored separately from the event log. The critical implementation details: (1) Snapshots must be immutable once written; (2) Snapshot schemas must be versioned independently from event schemas so the aggregate can be refactored; (3) Snapshot storage should support either atomic writes together with the events or idempotent overwrites, so that a retried write is harmless. Tradeoff: snapshots introduce a 'two sources of truth' risk. If the snapshot calculation has a bug, the snapshot diverges from the event log, and recovery requires rebuilding snapshots from events (slow but correct). Non-alternative: CQRS with separate read models speeds up queries but does not solve the command-side aggregate loading problem.
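The versioning and recovery points above can be sketched as follows. This is an illustration under stated assumptions, not a reference implementation: `SNAPSHOT_SCHEMA_VERSION`, `load`, and `snapshot_is_consistent` are hypothetical names, and the policy shown (discard a snapshot with an unexpected schema version and fall back to full replay) is one possible choice; migrating old snapshots is another.

```python
from typing import Optional

# Hypothetical: bumped whenever the snapshot layout changes,
# independently of any event schema version.
SNAPSHOT_SCHEMA_VERSION = 2


def apply_event(state: dict, event: dict) -> dict:
    """Pure fold step: returns a new state and never mutates its input."""
    if event["type"] == "deposited":
        return {"balance": state["balance"] + event["amount"]}
    if event["type"] == "withdrawn":
        return {"balance": state["balance"] - event["amount"]}
    raise ValueError(f"unknown event type: {event['type']}")


def replay(events: list, state: Optional[dict] = None) -> dict:
    """Fold events over a starting state (empty aggregate by default)."""
    state = {"balance": 0} if state is None else state
    for event in events:
        state = apply_event(state, event)
    return state


def make_snapshot(events: list, upto: int) -> dict:
    """Build a snapshot of the state after the first `upto` events."""
    return {
        "schema_version": SNAPSHOT_SCHEMA_VERSION,
        "event_count": upto,
        "state": replay(events[:upto]),
    }


def load(events: list, snapshot: Optional[dict]) -> tuple:
    """Returns (state, used_snapshot)."""
    if snapshot is None or snapshot["schema_version"] != SNAPSHOT_SCHEMA_VERSION:
        # Missing or unreadable snapshot: the event log remains the source of truth.
        return replay(events), False
    tail = events[snapshot["event_count"]:]
    return replay(tail, snapshot["state"]), True


def snapshot_is_consistent(events: list, snapshot: dict) -> bool:
    """Recompute state from events and compare against the snapshot (slow but correct)."""
    return replay(events[: snapshot["event_count"]]) == snapshot["state"]
```

Because `apply_event` returns a new dictionary at each step, a stored snapshot is never modified while loading, which keeps it immutable in practice. A periodic `snapshot_is_consistent` check is one way to detect divergence between snapshots and the event log before it affects command handling.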
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T02:34:19.857889+00:00— report_created — created