Report #58942
[architecture] Zero-downtime schema migration causing table locks or data inconsistency
Use the expand-contract pattern: (1) add the new column or table (expand), (2) deploy code that writes to both old and new, (3) backfill existing data asynchronously, (4) switch reads to the new column, (5) once nothing reads the old column, stop writing to it and remove it (contract).
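The steps above can be sketched roughly as follows. This is a minimal illustration using Python's built-in sqlite3 module, not a production migration. The `users` table, the `fullname` and `display_name` columns, and all function names are hypothetical, and a real system would spread these steps over several deploys.

```python
import sqlite3

BATCH_SIZE = 2  # tiny for illustration; real backfills use larger batches


def expand(conn: sqlite3.Connection) -> None:
    """Step 1: add the new nullable column; old code keeps working untouched."""
    conn.execute("ALTER TABLE users ADD COLUMN display_name TEXT")


def rename_user(conn: sqlite3.Connection, user_id: int, name: str) -> None:
    """Step 2: dual write, so readers of either column see fresh data."""
    conn.execute(
        "UPDATE users SET fullname = ?, display_name = ? WHERE id = ?",
        (name, name, user_id),
    )


def backfill(conn: sqlite3.Connection) -> int:
    """Step 3: copy old values into the new column in small batches."""
    total = 0
    while True:
        cur = conn.execute(
            "UPDATE users SET display_name = fullname "
            "WHERE id IN (SELECT id FROM users "
            "WHERE display_name IS NULL AND fullname IS NOT NULL LIMIT ?)",
            (BATCH_SIZE,),
        )
        conn.commit()  # one short transaction per batch
        if cur.rowcount == 0:
            return total
        total += cur.rowcount


def read_name(conn: sqlite3.Connection, user_id: int):
    """Step 4: reads now come from the new column."""
    row = conn.execute(
        "SELECT display_name FROM users WHERE id = ?", (user_id,)
    ).fetchone()
    return row[0] if row else None


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, fullname TEXT)")
    conn.executemany(
        "INSERT INTO users (id, fullname) VALUES (?, ?)",
        [(1, "Ada"), (2, "Grace"), (3, "Linus")],
    )
    conn.commit()

    expand(conn)
    rename_user(conn, 1, "Ada L.")  # a dual write that lands before the backfill
    conn.commit()
    copied = backfill(conn)  # only rows the dual write has not touched
    names = [read_name(conn, i) for i in (1, 2, 3)]
    # Step 5 (contract) would drop `fullname` in a later deploy,
    # once no running code reads or writes it.
    return copied, names


if __name__ == "__main__":
    print(demo())
```

The backfill only touches rows where the new column is still NULL, so it does not overwrite values already written by the dual-write path.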
Journey Context:
A direct ALTER TABLE on a large table can hold an exclusive lock for seconds to minutes, depending on the database engine and the operation, and blocks queries against that table for the duration. Online schema change tools (gh-ost, pt-online-schema-change) reduce locking for the DDL itself, but they do not coordinate semantic changes such as column renames or type changes, where the application code has to change as well. Expand-contract handles this application-level schema evolution safely: each stage can be rolled back on its own, and old and new code versions can run side by side against the same schema during a deploy. Doing the whole change in one deploy causes errors during the transition window, when old code sees the new schema or new code sees the old one.
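Because the contract step is the one stage that is hard to roll back, one possible safeguard is to check that the old and new columns agree before dropping anything. The sketch below assumes the same hypothetical `users` table with `fullname` (old) and `display_name` (new) columns and uses Python's built-in sqlite3 module. The function names are illustrative only.

```python
import sqlite3


def unmigrated_rows(conn: sqlite3.Connection) -> int:
    """Count rows where the new column is missing or differs from the old one.

    In SQLite, `IS NOT` is a NULL-safe inequality; other databases
    spell this differently (for example `IS DISTINCT FROM` in PostgreSQL).
    """
    row = conn.execute(
        "SELECT COUNT(*) FROM users WHERE display_name IS NOT fullname"
    ).fetchone()
    return row[0]


def safe_to_contract(conn: sqlite3.Connection) -> bool:
    """Only allow the old column to be removed once every row matches."""
    return unmigrated_rows(conn) == 0


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE users "
        "(id INTEGER PRIMARY KEY, fullname TEXT, display_name TEXT)"
    )
    conn.executemany(
        "INSERT INTO users (id, fullname, display_name) VALUES (?, ?, ?)",
        [
            (1, "Ada", "Ada"),         # already migrated
            (2, "Grace", None),        # not yet backfilled
            (3, "Linus", "Linus T."),  # drifted: a write skipped one column
        ],
    )
    conn.commit()
    before = unmigrated_rows(conn)

    # Repair the stragglers, treating the old column as the source of truth.
    conn.execute(
        "UPDATE users SET display_name = fullname "
        "WHERE display_name IS NOT fullname"
    )
    conn.commit()
    return before, safe_to_contract(conn)


if __name__ == "__main__":
    print(demo())
```

If the check reports mismatches, the migration can stay in the dual-write stage while the backfill or the write path is fixed, which keeps rollback cheap.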
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T05:25:27.443727+00:00— report_created — created