Report #65318
[architecture] Directly dropping columns or adding non-nullable constraints causes downtime and application errors during deployment
Use the Expand/Contract pattern: 1) Deploy code that writes to both the old and new structures (dual-write) while still reading from the old one; 2) Backfill existing data into the new structure; 3) Switch reads to the new structure; 4) Remove the old write path and, eventually, drop the old column. Never drop an old column until every code path is verified to no longer read or write it.
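The four steps above can be sketched end to end against an in-memory SQLite database. This is a minimal illustration, not a production migration: the `users` table, the `fullname` and `display_name` columns, and every function name are hypothetical and chosen only for this example.

```python
import sqlite3


def expand(conn):
    # Expand: add the new column as nullable so existing writers keep working.
    conn.execute("ALTER TABLE users ADD COLUMN display_name TEXT")


def create_user_dual_write(conn, name):
    # Step 1: write both the old and the new column; reads still use the old one.
    cur = conn.execute(
        "INSERT INTO users (fullname, display_name) VALUES (?, ?)", (name, name)
    )
    return cur.lastrowid


def backfill(conn):
    # Step 2: copy old values into rows written before dual-write started.
    # The IS NULL guard makes the job safe to re-run.
    cur = conn.execute(
        "UPDATE users SET display_name = fullname "
        "WHERE display_name IS NULL AND fullname IS NOT NULL"
    )
    return cur.rowcount


def read_name_new(conn, user_id):
    # Step 3: reads switch to the new column once the backfill is complete.
    row = conn.execute(
        "SELECT display_name FROM users WHERE id = ?", (user_id,)
    ).fetchone()
    return row[0]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, fullname TEXT)")
conn.execute("INSERT INTO users (fullname) VALUES ('Ada')")  # legacy row

expand(conn)
new_id = create_user_dual_write(conn, "Grace")
backfilled = backfill(conn)

# Step 4 (contract) would stop writing fullname and later drop it, but only
# after every deployed code path has been verified to use display_name.
```

The contract step is left as a comment on purpose: dropping the column is the only irreversible step, so it belongs in a separate, later deploy.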
Journey Context:
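Running `ALTER TABLE ADD COLUMN NOT NULL` on a large table can, depending on the database and version, hold an exclusive lock on the table for the duration of the change. Ordering the deploy around the migration does not help: code that expects the new column fails if it is deployed before the migration runs, and if the migration runs first, the old code still in service hits a schema it does not expect (for example, a column that is now missing). The expand/contract pattern decouples schema changes from code deploys by keeping every intermediate state backward compatible. For example, to rename a column: add the new column (expand), dual-write, backfill, switch reads, then drop the old column (contract). This requires idempotent backfill jobs, so that a failed or repeated run is harmless, and careful handling of NULLs, because the new column must stay nullable until the backfill has finished. Tools like `pt-online-schema-change` (Percona) and `gh-ost` (GitHub) apply similar logic at the table level: they copy rows into a shadow table and keep it in sync using triggers (`pt-online-schema-change`) or binlog parsing (`gh-ost`), which avoids long locks.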
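One way to make the backfill idempotent and NULL-aware is to walk the primary key in small batches and only fill rows where the new column is still empty. The sketch below uses SQLite for convenience; the `users` table, its columns, the batch size, and the function name `backfill_in_batches` are assumptions made for illustration.

```python
import sqlite3


def backfill_in_batches(conn, batch_size=2):
    """Copy fullname into display_name in small batches; safe to re-run."""
    total = 0
    last_id = 0
    while True:
        # Walk the primary key in order so each batch touches a bounded range.
        rows = conn.execute(
            "SELECT id FROM users WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, batch_size),
        ).fetchall()
        if not rows:
            return total
        upper = rows[-1][0]
        # Only fill rows that are still empty: values already written by the
        # dual-write path are never overwritten, and rows whose old value is
        # NULL are left as NULL.
        cur = conn.execute(
            "UPDATE users SET display_name = fullname "
            "WHERE id > ? AND id <= ? "
            "AND display_name IS NULL AND fullname IS NOT NULL",
            (last_id, upper),
        )
        total += cur.rowcount
        conn.commit()  # short transactions keep lock times small
        last_id = upper


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, fullname TEXT, display_name TEXT)"
)
conn.executemany(
    "INSERT INTO users (id, fullname, display_name) VALUES (?, ?, ?)",
    [
        (1, "Ada", None),
        (2, None, None),          # old value is NULL; nothing to copy
        (3, "Grace", "Grace"),    # already dual-written
        (4, "Linus", None),
        (5, "Margaret", None),
    ],
)
conn.commit()

first_run = backfill_in_batches(conn)
second_run = backfill_in_batches(conn)
```

Committing per batch means an interrupted job can simply be started again; the `IS NULL` guard makes the repeated work a no-op.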
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T16:07:09.235298+00:00— report_created — created