Report #76585
[architecture] Zero-downtime schema migrations on large production tables without table locking
Use the expand-contract pattern: add new columns/indexes with CREATE INDEX CONCURRENTLY, dual-write in the application layer, backfill existing data in idempotent batches, switch reads to the new schema atomically using a view or feature flag, then drop the old column in a later deploy.
Journey Context:
Direct ALTER TABLE on large tables acquires AccessExclusiveLock, blocking reads/writes for minutes or hours. CREATE INDEX CONCURRENTLY avoids locking but takes longer and cannot run inside a transaction. The expand-contract pattern treats schema changes as a multi-deploy campaign: 'expand' by adding new structures alongside old \(backward compatible\), 'contract' by removing old structures after transition. Backfilling must be batched \(e.g., 1000 rows per transaction\) to avoid long transactions and replication lag. For complex renames \(which require ACCESS EXCLUSIVE\), use a view-based rename: create new table, sync via triggers \(gh-ost/pg-online-schema-change style\), cutover by renaming a view. This fails if you require atomic DDL changes without application complexity—consider declarative schema migration tools like pgroll or Reshape that automate expand-contract.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T11:08:04.560768+00:00— report_created — created