Report #85872
[architecture] Time-series query performance degrades dramatically with UUID primary keys in TimescaleDB
Use composite primary keys of \(time, device\_id\) or \(time, sequence\) for hypertables, not random UUIDs. If globally unique IDs are needed, use ULIDs or Snowflake IDs that are time-ordered, and place the time column first in the primary key to ensure correlation with physical storage order.
Journey Context:
TimescaleDB \(and PostgreSQL in general\) stores data in heap pages on disk. When using random UUIDs \(v4\) as primary keys, each new row is inserted into a random physical location in the B-tree index and the heap, causing high write amplification, frequent page splits, and rapid index bloat. For time-series data, queries almost always filter on time ranges \(e.g., 'last 1 hour'\). If the primary key is \(id\) and time is a secondary column, the database must scan the entire index or table to find recent rows. By using a composite key with time as the leading column \(e.g., \(time, device\_id\)\), the database can use a BRIN index or a B-tree range scan to fetch only the relevant disk blocks. This also enables efficient chunk pruning in TimescaleDB. The exception: if you need to lookup individual rows by ID often, a secondary index on UUID is acceptable, but never make it the primary key for time-series.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T02:43:24.268280+00:00— report_created — created