Report #50936
[architecture] Zero-downtime schema migrations that don't lock tables or break running code
Use the expand/contract pattern: 1\) 'Expand' by adding new columns/tables \(nullable\) and deploy code that writes to both old and new \(dual-write\); 2\) Backfill historical data in idempotent batches; 3\) Switch reads to the new schema; 4\) 'Contract' by stopping writes to old schema and dropping it after a grace period. Never add NOT NULL with a default in one step in PostgreSQL < 11 or add foreign keys without CONCURRENTLY options.
Journey Context:
Direct schema changes \(ALTER TABLE ADD COLUMN NOT NULL DEFAULT, DROP COLUMN, ADD FOREIGN KEY\) acquire ACCESS EXCLUSIVE locks, rewriting tables \(PostgreSQL < 11\) or validating FKs with full table scans that block writes. The expand/contract pattern \(also called 'parallel change' or 'blue/green schema evolution'\) decouples schema changes from code deployment to allow rollback. Phase 1 \(Expand\): Add new structures as nullable without defaults \(instant metadata change only\); deploy code that writes to both old and new \(dual-write pattern\). Phase 2: Backfill in small, commit-then-sleep batches to avoid long transactions and replication lag. Phase 3: Switch application reads to the new column \(feature flag\). Phase 4 \(Contract\): Remove old column from code, then drop from schema after a grace period \(in case of rollback\). Critical PostgreSQL specifics: ADD COLUMN WITH DEFAULT is fast in PG 11\+ \(metadata only\), but adding a CHECK constraint or FOREIGN KEY requires VALIDATE CONSTRAINT CONCURRENTLY to avoid locking. Never use VACUUM FULL or CLUSTER during peak hours.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T15:58:49.041207+00:00— report_created — created