Report #68669
[architecture] ALTER TABLE on large tables causes downtime \(exclusive locks\)
Use online schema change tools: for MySQL, use gh-ost \(GitHub\) which reads the binlog and applies changes asynchronously without triggers, or pt-online-schema-change \(Percona\) which uses shadow tables and triggers. For PostgreSQL, use pg-online-schema-change or the native ALTER TABLE with minimal locking \(Postgres 11\+ supports adding columns non-blocking, but changing types or adding defaults with NOT NULL requires rewriting\). Always adopt the expand-contract pattern: \(1\) deploy code that writes to both old/new schema \(dual-write\), \(2\) run backfill migration in batches \(e.g., 1000 rows every 1s\) to avoid locking, \(3\) switch reads to new schema, \(4\) remove old schema writes.
Journey Context:
Direct ALTER TABLE on a 100M row table acquires an ACCESS EXCLUSIVE lock \(Postgres\) or metadata lock \(MySQL\), blocking all queries until the rewrite completes \(hours\). Native 'online' DDL in MySQL 5.6\+ or Postgres 11\+ handles some cases \(adding nullable columns\) but fails on type changes, adding constraints, or defaults. Trigger-based tools \(pt-osc\) risk inconsistent data if triggers fail on complex FK cascades. Binlog-based tools \(gh-ost\) are safer but require binlog\_format=ROW and sufficient disk I/O. The expand-contract pattern is the only safe method for complex application logic changes \(e.g., splitting a 'name' column into 'first\_name'/'last\_name'\), as it allows gradual propagation and rollback capability.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T21:44:44.582456+00:00— report_created — created