Report #61442
[architecture] Data loss or downtime during column renames, type changes, or table restructuring in production
Implement the Expand-Contract pattern: 1) Expand (additive changes only: new columns/tables), 2) Dual-write to both old and new structures while reads still come from the old, 3) Idempotent backfill of existing data, 4) Switch reads via feature flag, 5) Contract (remove old columns only after monitoring).
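A minimal sketch of steps 1, 2, and 4, assuming an SQLite table named `items` with a text column `old_col` being migrated to an integer column `new_col`. The table, the column types, the function names, and the `READ_FROM_NEW` flag are hypothetical and only serve the illustration; a real system would read the flag from its feature-flag service.

```python
import sqlite3

# Hypothetical flag; in practice this would come from a feature-flag service.
READ_FROM_NEW = False


def expand(conn):
    # Step 1 (Expand): additive change only; old_col stays untouched.
    conn.execute("ALTER TABLE items ADD COLUMN new_col INTEGER")


def write_value(conn, item_id, value):
    # Step 2 (dual-write): every new write keeps both columns in sync.
    conn.execute(
        "UPDATE items SET old_col = ?, new_col = ? WHERE id = ?",
        (str(value), int(value), item_id),
    )


def read_value(conn, item_id, read_from_new=None):
    # Step 4 (switch reads): the flag decides which column serves reads.
    use_new = READ_FROM_NEW if read_from_new is None else read_from_new
    # The column name is chosen from a fixed pair, never from user input.
    column = "new_col" if use_new else "old_col"
    row = conn.execute(
        f"SELECT {column} FROM items WHERE id = ?", (item_id,)
    ).fetchone()
    return row[0] if row else None
```

Rows written before the dual-write deployment still have an empty `new_col` at this point, which is why the flag should stay off until the backfill has finished.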
Journey Context:
Altering a column type in place can force a full table rewrite under an exclusive lock, which blocks reads and writes on large tables, and renaming a column or table breaks any running application code that still uses the old name. The Expand-Contract pattern avoids both by never modifying existing structures in place during the transition. In the Expand phase, you add new structures (e.g., new_col) without dropping old ones. Application code writes to both (dual-write) and still reads from the old column. After that code is deployed, a backfill job populates new_col for existing rows; the updates must be idempotent so the job can be re-run safely after a partial failure. Once the backfill completes, you flip a feature flag to read from new_col, monitor for consistency, and then enter the Contract phase by dropping the old columns. The tradeoffs: code complexity increases temporarily (two schemas must be handled at once), storage for the affected columns or tables is duplicated during the transition, and rollback is safe only before the Contract phase. Tools like gh-ost and pt-online-schema-change apply a similar copy-then-swap approach to MySQL table alterations, but the pattern applies to application-level schema changes on any database.
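The idempotent backfill could look roughly like the sketch below. It assumes an SQLite table `items` with columns `id`, `old_col` (text), and `new_col` (integer); these names, the `backfill` function, and the batch size are hypothetical. Each batch touches only rows whose `new_col` is still NULL, so re-running the job after a crash picks up where it stopped and never rewrites migrated rows.

```python
import sqlite3


def backfill(conn, batch_size=1000):
    """Copy old_col into new_col in small batches; safe to re-run.

    Assumes the conversion never yields NULL for a non-NULL old_col,
    otherwise the same rows would be selected again on every pass.
    """
    total = 0
    while True:
        cur = conn.execute(
            """
            UPDATE items
               SET new_col = CAST(old_col AS INTEGER)
             WHERE id IN (
                   SELECT id FROM items
                    WHERE new_col IS NULL AND old_col IS NOT NULL
                    ORDER BY id
                    LIMIT ?)
            """,
            (batch_size,),
        )
        # Commit per batch so locks are held briefly and progress survives a crash.
        conn.commit()
        if cur.rowcount == 0:
            return total
        total += cur.rowcount
```

Small batches with a commit after each one keep lock times short; the `new_col IS NULL` filter is what makes a second run a no-op.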
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T09:37:00.131645+00:00— report_created — created