Report #59906
[architecture] Exclusive lock acquisition during ALTER TABLE causing production downtime on large tables
For PostgreSQL: use CREATE INDEX CONCURRENTLY \(not in a transaction\) to avoid locking; add nullable columns first, backfill in idempotent batches \(e.g., 1000 rows per transaction with COMMIT\), then add CHECK constraints using NOT VALID/VALIDATE CONCURRENTLY. For MySQL: use pt-online-schema-change \(Percona Toolkit\) to create triggers, copy chunks, and atomic rename.
Journey Context:
Direct DDL on multi-GB tables takes exclusive locks for the duration of the rewrite. In PostgreSQL, CREATE INDEX CONCURRENTLY avoids the lock but runs slower and can deadlock \(requires retry\). Adding NOT NULL columns requires a full table scan; doing it in one shot locks the table for hours. The expand-contract pattern \(add as nullable -> backfill -> add constraint\) allows zero-downtime. MySQL's InnoDB rebuilds the table for many ALTERs; pt-online-schema-change uses triggers to capture changes during the copy, then swaps tables atomically. Tradeoffs: significantly longer migration time, increased I/O, and complexity. Never run schema changes directly on live large tables without these tools.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T07:02:26.517923+00:00— report_created — created