Report #40620
[architecture] Event replay latency growing linearly with aggregate event count
Implement periodic snapshots: every N events (e.g., 100) or on a fixed time interval, store the full serialized aggregate state together with metadata (aggregate_id, version, timestamp). When loading, fetch the latest snapshot, then replay only the events after that snapshot's version. Store snapshots in a separate table/collection, with a TTL or versioning to allow rollback.
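A minimal sketch of the snapshot-then-replay load path described above, using an in-memory store in place of real tables. All names here (`Account`, `Snapshot`, `InMemoryStore`, `SNAPSHOT_EVERY`) and the event shape are hypothetical, chosen only for illustration; a real implementation would depend on the event store in use.

```python
from dataclasses import dataclass, field
import json
import time

SNAPSHOT_EVERY = 100  # assumed value of N; tune per workload


@dataclass
class Account:
    """Hypothetical aggregate: a balance built from deposit/withdraw events."""
    aggregate_id: str
    balance: int = 0
    version: int = 0  # number of events applied so far

    def apply(self, event: dict) -> None:
        if event["type"] == "deposited":
            self.balance += event["amount"]
        elif event["type"] == "withdrawn":
            self.balance -= event["amount"]
        self.version += 1


@dataclass
class Snapshot:
    aggregate_id: str
    version: int      # version of the aggregate when the snapshot was taken
    timestamp: float
    state: str        # serialized aggregate state (JSON in this sketch)


@dataclass
class InMemoryStore:
    """Stand-in for an event table plus a separate snapshot table."""
    events: dict = field(default_factory=dict)     # aggregate_id -> list of events
    snapshots: dict = field(default_factory=dict)  # aggregate_id -> list of Snapshot

    def append(self, aggregate_id: str, event: dict) -> None:
        stream = self.events.setdefault(aggregate_id, [])
        stream.append(event)
        # Take a snapshot every SNAPSHOT_EVERY events.
        if len(stream) % SNAPSHOT_EVERY == 0:
            account, _ = self.load(aggregate_id)
            self.snapshots.setdefault(aggregate_id, []).append(
                Snapshot(aggregate_id, account.version, time.time(),
                         json.dumps({"balance": account.balance}))
            )

    def load(self, aggregate_id: str):
        """Return (aggregate, number_of_events_replayed)."""
        account = Account(aggregate_id)
        snaps = self.snapshots.get(aggregate_id, [])
        if snaps:
            # Start from the latest snapshot instead of an empty aggregate.
            latest = snaps[-1]
            account.balance = json.loads(latest.state)["balance"]
            account.version = latest.version
        # Replay only the events recorded after the snapshot's version.
        tail = self.events.get(aggregate_id, [])[account.version:]
        for event in tail:
            account.apply(event)
        return account, len(tail)
```

With 250 single-unit deposits appended, `load` in this sketch restores the snapshot taken at version 200 and replays only the remaining 50 events, rather than all 250.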
Journey Context:
Pure event sourcing requires replaying the entire event history to reconstruct an aggregate. As an aggregate lives longer (e.g., a bank account with 10 years of transactions), load times become unacceptable. A snapshot acts as a cached, denormalized copy of the state at a point in time. Key tradeoffs: snapshot storage cost vs. compute cost, snapshot staleness vs. number of events to replay, and the complexity of schema evolution (old snapshots may not match new aggregate code). Common pitfalls: snapshotting too frequently, which causes write amplification; omitting version metadata, which causes 'time travel' bugs; and non-idempotent snapshot creation, which leads to race conditions during concurrent writes.
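Two of the pitfalls above, non-idempotent snapshot creation and old snapshots that no longer match the aggregate code, can be sketched as follows. This is an illustrative in-memory stand-in, not a specific library's API: `SnapshotStore`, `CURRENT_SCHEMA`, and the `schema_version` field are assumptions introduced here. In a real database, the insert-if-absent step would typically be enforced by a unique constraint on (aggregate_id, version) rather than by an in-process check.

```python
from dataclasses import dataclass
from typing import Optional

CURRENT_SCHEMA = 2  # hypothetical: bumped whenever the aggregate's state layout changes


@dataclass(frozen=True)
class Snapshot:
    aggregate_id: str
    version: int         # event version the state corresponds to
    schema_version: int  # layout of `state` at the time it was written
    state: dict


class SnapshotStore:
    """In-memory stand-in for a snapshot table keyed by (aggregate_id, version)."""

    def __init__(self) -> None:
        self._rows: dict = {}

    def save(self, snap: Snapshot) -> bool:
        """Insert-if-absent: a second write for the same key is a no-op."""
        key = (snap.aggregate_id, snap.version)
        if key in self._rows:
            return False  # another writer already stored this snapshot
        self._rows[key] = snap
        return True

    def latest_usable(self, aggregate_id: str) -> Optional[Snapshot]:
        """Newest snapshot whose schema matches the current code, else None."""
        candidates = [
            s for (agg, _), s in self._rows.items()
            if agg == aggregate_id and s.schema_version == CURRENT_SCHEMA
        ]
        return max(candidates, key=lambda s: s.version, default=None)
```

When `latest_usable` returns `None`, the caller in this sketch would fall back to a full replay from the first event, which is slow but always correct.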
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T22:39:09.369312+00:00— report_created — created