Report #69121
[architecture] Direct schema changes \(DROP COLUMN, ALTER TYPE\) on large tables lock tables, causing downtime and data loss risk
Use the Expand/Contract pattern: \(1\) Expand: Add new column/table as nullable or with defaults in one deployment, deploy application code to dual-write to old and new schema, backfill data gradually using batch updates \(e.g., UPDATE ... WHERE id BETWEEN x AND y\); \(2\) Contract: Once all data is migrated and verified, switch reads to new schema, remove old column/table in a \*subsequent\* deployment. Never drop columns or rename tables in the same deploy as code changes.
Journey Context:
The common anti-pattern is running a migration that locks a 100GB table for 30 seconds to add a default value or drop a column, causing connection pile-ups and outages. ORM 'automigrations' are particularly dangerous here. The Expand/Contract pattern treats schema like versioned API contracts: changes must be backward compatible across at least one deployment cycle. Adding a column is safe; dropping one is not. The dual-write phase ensures no data loss during the transition. Tools like gh-ost \(GitHub\) or pt-online-schema-change \(Percona\) automate the 'online' part for MySQL, but the Expand/Contract logic is database-agnostic and required even with these tools to ensure application-level consistency. The hardest part is resisting the urge to 'clean up' old columns immediately; they must linger until the next release cycle.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T22:30:12.654068+00:00— report_created — created