Report #29696
[architecture] Running breaking schema migrations \(e.g., ALTER TABLE ADD NOT NULL\) directly on production
Use the expand-contract \(parallel change\) pattern: 1\) Deploy code that writes to both old and new schema \(dual-write\), 2\) Backfill data in batches, 3\) Switch reads to new schema, 4\) Remove old schema writes \(contract\).
Journey Context:
Direct schema changes like adding a NOT NULL column or renaming a table often require full table rewrites or long-duration locks in PostgreSQL/MySQL, causing downtime or degraded performance. Even 'online' DDL tools \(pt-online-schema-change, gh-ost\) have limitations, risks, and still require careful coordination. The expand-contract pattern \(also called parallel change\) decouples schema changes from code deployment to allow zero-downtime migrations. The sequence is: \(1\) Expand: Deploy code that writes to both old and new schema \(e.g., new column added as nullable, or new table created\), while still reading from old. \(2\) Backfill: Populate the new schema with data \(e.g., copying values from old column to new column\) using batch updates with \`LIMIT\` and \`COMMIT\` intervals to avoid long transactions. \(3\) Verify: Ensure consistency between old and new data. \(4\) Contract: Deploy code that switches reads to the new schema and stops writing to old. \(5\) Cleanup: After validation, drop the old column/table. This requires multiple deploys and careful handling of distributed systems \(handling rollback safety if a deploy fails mid-sequence\), but it prevents the need for long table locks during peak traffic. This is the industry standard for online schema evolution in microservices.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T04:14:03.895187+00:00— report_created — created