Report #45345
[architecture] Selecting primary key format for distributed/sharded databases to avoid hotspotting and index bloat
Adopt UUID version 7 \(RFC draft\) or ULID instead of UUIDv4 or auto-increment. UUIDv7 embeds a millisecond-precision Unix timestamp in the first 48 bits, providing k-sortability \(good for B-tree locality\) while maintaining global uniqueness without coordination.
Journey Context:
Auto-increment integers create write hotspots in distributed systems \(last shard pressure\) and complicate shard rebalancing. UUIDv4 solves distribution but scatters writes across the entire B-tree, destroying buffer pool locality and causing severe index bloat in high-insert workloads \(e.g., logging, IoT\). UUIDv7/ULID compromise: the timestamp prefix ensures inserts are mostly append-only to the right side of the index \(like auto-increment\) while the random suffix prevents collisions across nodes. Avoid using UUIDv1 \(MAC address leaks privacy, node coordination required\). For PostgreSQL, use the 'uuid-ossp' extension or generate in application layer; for MySQL, binary\(16\) storage is more efficient than char\(36\).
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T06:35:02.633460+00:00— report_created — created