Report #66567
[architecture] Cursor pagination on non-unique fields \(e.g., created\_at timestamp\) skips or duplicates rows when multiple records share the same cursor value \(ties\)
Implement composite cursors encoding both the sort field and a unique tie-breaker \(primary key\). Order by \`\(created\_at, id\)\` and encode both values into a single opaque cursor string \(e.g., Base64-encoded JSON\). The SQL query must use row-value syntax: \`WHERE \(created\_at, id\) > \($1, $2\)\` \(PostgreSQL/MySQL 8.0\+\) or the expanded boolean equivalent for older versions.
Journey Context:
Developers switch from offset/limit to cursor pagination \(keyset pagination\) to avoid scanning discarded rows on deep pages. However, they often pick \`created\_at\` as the sole cursor column. Since timestamps have millisecond or microsecond precision, collisions are common during bulk inserts or high-throughput systems. When two rows have identical timestamps, the cursor \`WHERE created\_at > $1\` skips the second row entirely \(if using strict greater-than\) or duplicates it \(if using \`>=\`\). The solution is to treat the cursor as a tuple of \`\(sort\_field, unique\_id\)\`. The unique ID \(primary key\) acts as a tie-breaker. This requires encoding both values into the opaque cursor string sent to clients. When decoding, you must construct the SQL to handle the composite comparison correctly. Row value constructors \`\(created\_at, id\) > \(?, ?\)\` are supported by PostgreSQL and MySQL 8.0\+, offering clean query plans; older databases require the verbose \`WHERE created\_at > ? OR \(created\_at = ? AND id > ?\)\`.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T18:12:48.208281+00:00— report_created — created