Report #62004
[architecture] Adding a column or changing a type causes downtime or data loss during deployment
Use the expand-contract pattern: 1\) Add the new column/table \(expand\), 2\) Dual-write to both old and new, 3\) Backfill existing data into the new location, 4\) Switch reads to new, 5\) Stop writing to old and remove it \(contract\)
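A minimal sketch of steps 1, 2, and 4 at the application level, assuming SQLite and a hypothetical schema in which `users.display_name` is being moved to a new `profiles` table. The table names, column names, function names, and the `read_from_new` flag are illustrative assumptions, not part of the original report.

```python
import sqlite3


def expand(conn: sqlite3.Connection) -> None:
    # Step 1 (expand): create the new table next to the old column.
    # Nothing reads from it yet, so this is safe to deploy on its own.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS profiles ("
        "user_id INTEGER PRIMARY KEY, display_name TEXT)"
    )


def save_display_name(conn: sqlite3.Connection, user_id: int, name: str) -> None:
    # Step 2 (dual-write): write old and new locations in one transaction
    # so they cannot drift apart if one of the writes fails.
    with conn:
        conn.execute(
            "UPDATE users SET display_name = ? WHERE id = ?", (name, user_id)
        )
        conn.execute(
            "INSERT OR REPLACE INTO profiles (user_id, display_name) VALUES (?, ?)",
            (user_id, name),
        )


def load_display_name(conn: sqlite3.Connection, user_id: int, read_from_new: bool):
    # Step 4 (switch reads): a hypothetical flag picks the source, so the
    # switch can be rolled back without a schema change.
    if read_from_new:
        query = "SELECT display_name FROM profiles WHERE user_id = ?"
    else:
        query = "SELECT display_name FROM users WHERE id = ?"
    row = conn.execute(query, (user_id,)).fetchone()
    return row[0] if row else None
```

Once reads have run against the new table for long enough, the contract step would drop the write to `users.display_name` from `save_display_name` and then remove the old column.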
Journey Context:
Running ALTER TABLE directly on a large table can lock it for seconds to hours, depending on the database engine, the kind of change, and the table size, which causes downtime. Tools like gh-ost or pt-online-schema-change help for MySQL, but application-level changes \(e.g., splitting a monolithic User table into Profile\) also require application code changes. The expand-contract pattern \(also called parallel change; loosely analogous to blue-green deployment applied to data\) keeps every intermediate step compatible with both the old and the new schema, so each step can be deployed without downtime. Common mistake: forgetting to dual-write during the transition, which leads to data drift between old and new. Backfills must also be idempotent \(safe to rerun\) and batched, so that no single statement holds locks for long. Tradeoffs: code complexity increases temporarily because deployments must handle both schemas, and storage increases while both copies exist. The pattern does not remove the ordering constraints of DDL changes that alter semantics: for example, changing a column from nullable to non-nullable requires adding a default \(and filling existing NULLs\) first.
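The requirement that backfills be idempotent and batched can be sketched as follows, again assuming SQLite and hypothetical `users` and `profiles` tables keyed by an integer id. The function name and batch size are illustrative assumptions. Each batch runs in its own short transaction, and `INSERT OR IGNORE` skips rows that a dual-write or an earlier run already created.

```python
import sqlite3


def backfill_profiles(conn: sqlite3.Connection, batch_size: int = 1000) -> int:
    """Copy users.display_name into profiles; returns the number of rows scanned."""
    last_id = 0
    scanned = 0
    while True:
        # Keyset pagination: resume after the last id seen rather than using
        # OFFSET, so each batch query stays cheap on large tables.
        rows = conn.execute(
            "SELECT id, display_name FROM users WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, batch_size),
        ).fetchall()
        if not rows:
            return scanned
        with conn:  # one short transaction per batch
            # Idempotent: existing profile rows are left untouched, so a
            # rerun never overwrites newer values written by dual-writes.
            conn.executemany(
                "INSERT OR IGNORE INTO profiles (user_id, display_name) "
                "VALUES (?, ?)",
                rows,
            )
        scanned += len(rows)
        last_id = rows[-1][0]
```

Because existing rows are ignored rather than replaced, the job can be stopped and restarted at any point during the transition. A production version would likely also pause between batches and persist `last_id` to avoid rescanning.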
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T10:33:48.490890+00:00— report_created — created