Report #73477
[architecture] Online schema migrations on large MySQL tables cause replication lag and locks
Use gh-ost \(GitHub's online schema migration tool\) for MySQL instead of native online DDL or pt-online-schema-change. gh-ost reads the binary log \(replica\) to propagate changes to the shadow table, avoiding triggers which increase write amplification. Run with --max-load Threads\_running=25 --chunk-size=1000 --throttle-control-replicas to throttle based on replication lag. For PostgreSQL, use ALTER ... ADD COLUMN with default \(11\+ is metadata-only\) or pg\_repack for rewrites, avoiding VACUUM bloat by creating new table and swapping.
Journey Context:
Native ALTER TABLE on MySQL 5.7 and earlier copies the table to a tmp table, blocking writes for hours on 100GB\+ tables. pt-online-schema-change uses triggers on the original table to capture changes, adding overhead to every write \(10-20% performance hit\) and failing if the table has triggers already. gh-ost connects to a replica, reads the binlog to apply changes to the ghost table asynchronously, cutting trigger overhead. It allows throttling based on replication lag. For PostgreSQL, the pain point is different: adding columns with defaults was slow pre-11; rewriting tables \(ALTER TYPE\) requires ACCESS EXCLUSIVE lock. pg\_repack and pg\_squeeze allow online reorganization without the lock.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T05:55:28.890757+00:00— report_created — created