Report #35557
[architecture] Running ALTER TABLE on large production tables causes locks, downtime, or replication lag
Use the Expand-Contract pattern: (1) Expand: add the new column or table as nullable and start dual-writing, without breaking old code; (2) Migrate: backfill existing data asynchronously; (3) Contract: switch reads to the new structure, then stop writing to the old column and drop it.
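The three phases can be sketched as follows. This is a minimal illustration only, using Python's built-in `sqlite3` module; the `users` table, the old `full_name` column, the new `display_name` column, and the function names are all hypothetical and not taken from the report.

```python
import sqlite3


def expand(conn):
    # Phase 1 (Expand): add the new column as nullable so that code which
    # only knows about full_name keeps working unchanged.
    conn.execute("ALTER TABLE users ADD COLUMN display_name TEXT")


def backfill(conn, batch_size=1000):
    # Phase 2 (Migrate): copy data in small batches. Only rows whose new
    # column is still NULL are touched, so rerunning this is safe.
    total = 0
    while True:
        cur = conn.execute(
            "UPDATE users SET display_name = full_name "
            "WHERE id IN (SELECT id FROM users "
            "WHERE display_name IS NULL AND full_name IS NOT NULL "
            "LIMIT ?)",
            (batch_size,),
        )
        conn.commit()  # commit per batch to keep each transaction short
        if cur.rowcount == 0:
            return total
        total += cur.rowcount


def contract(conn):
    # Phase 3 (Contract): run only after every reader and writer uses
    # display_name. Assumption: DROP COLUMN needs SQLite 3.35 or newer.
    conn.execute("ALTER TABLE users DROP COLUMN full_name")
```

Each function would normally ship as a separate deployment step, with the application release that switches reads to `display_name` landing between `backfill` and `contract`.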
Journey Context:
Direct DDL changes on large tables (millions of rows or more) in MySQL or Postgres often block or rebuild the table: in Postgres most ALTER TABLE forms take an ACCESS EXCLUSIVE lock, and in MySQL some changes copy the entire table. Either can cause minutes to hours of downtime or replication lag. Tools like gh-ost (binlog streaming) and pt-online-schema-change (triggers) work around this for MySQL by building a shadow table, whereas the Expand-Contract pattern is applied at the application level and does not depend on database-specific tooling. It allows zero-downtime changes by keeping every step backward compatible with the code already deployed. Critical: make backfills idempotent and handle the dual-write period carefully to avoid data drift between the old and new structures.
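One way to guard the dual-write period is to write both structures in a single transaction and to measure drift before contracting. The sketch below assumes a hypothetical `users` table whose old `full_name` column is being replaced by a new `display_name` column, and uses Python's built-in `sqlite3` module; the helper names are illustrative.

```python
import sqlite3


def write_name(conn, user_id, name):
    # Dual-write: update the old and the new column in one transaction so
    # a failure cannot leave them disagreeing.
    with conn:  # commits on success, rolls back on exception
        conn.execute(
            "UPDATE users SET full_name = ?, display_name = ? WHERE id = ?",
            (name, name, user_id),
        )


def count_drift(conn):
    # Number of rows where the old and new columns disagree. IS NOT is
    # SQLite's NULL-safe inequality; other databases spell this
    # differently (for example IS DISTINCT FROM in Postgres).
    row = conn.execute(
        "SELECT COUNT(*) FROM users WHERE display_name IS NOT full_name"
    ).fetchone()
    return row[0]
```

A drift count of zero, held steady while dual-writes are live, is a reasonable precondition for switching reads and dropping the old column.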
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T14:09:02.985256+00:00— report_created — created