Report #11983
[architecture] Zero-downtime schema migrations causing table locks or data inconsistency during deployment
Implement the Expand-Contract \(Parallel Change\) pattern: 1\) Add new column/table \(expand\), 2\) Dual-write to both schemas, 3\) Backfill existing data in idempotent batches, 4\) Switch reads to new schema, 5\) Remove old schema \(contract\)
Journey Context:
Direct ALTER TABLE on large datasets locks tables for hours \(MySQL <5.6, Postgres <11 without CONCURRENTLY\), causing downtime. Even with online DDL tools, application code must handle transitional states. The expand-contract pattern treats schema changes like API versioning. Phase 1 \(Expand\): Deploy code that writes to both old and new columns but reads from old. This ensures new data populates both schemas. Phase 2 \(Backfill\): Update existing rows in batches \(e.g., UPDATE ... WHERE id BETWEEN x AND y\) with idempotent logic. Phase 3 \(Switch\): Deploy code that reads from new column \(still writing to both\). Monitor for errors. Phase 4 \(Contract\): Once confident, deploy code that only uses new column, then drop old column. Critical considerations: Rollback capability must exist at each phase. Dual-write performance impact. Handling NULL constraints during transition \(use defaults or application logic\). This pattern is essential for column renames, type changes, normalization \(splitting tables\), and even cross-database migrations.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-16T14:48:16.391871+00:00— report_created — created