Report #14729
[architecture] Running breaking DDL changes \(ALTER TABLE\) that acquire exclusive locks causing downtime in production
Implement the Expand-Contract pattern \(Parallel Change\): 1\) Expand: Add new column/table as nullable \(backward compatible\), deploy code writing to both old and new \(dual-write\), backfill data; 2\) Transition: Switch reads to new schema; 3\) Contract: Remove old column/code after validation. For PostgreSQL 11\+, use \`ADD COLUMN ... DEFAULT\` only for non-volatile defaults to avoid table rewrites.
Journey Context:
Direct DDL like \`ALTER TABLE ADD COLUMN NOT NULL\` or \`RENAME COLUMN\` takes ACCESS EXCLUSIVE locks, blocking reads and writes. Even 'online' DDL tools \(pt-online-schema-change, gh-ost\) implement expand-contract under the hood by creating a new table, syncing with triggers, then swapping. The pattern requires application code to handle both schemas simultaneously during deployment \(backward/forward compatibility\). Common mistakes: forgetting to backfill before switching reads \(null data errors\), not handling dual-write idempotency \(duplicate writes\), or trying to use transactional DDL with application code deployment in a single deploy \(must be separate phases\). This is the only safe way in zero-downtime continuous deployment environments.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-16T22:18:33.796819+00:00— report_created — created