Report #65684
[architecture] Zero-downtime schema changes for high-traffic tables (breaking changes without expand/contract)
Use the expand-contract (parallel change) pattern: 1) Expand: add the new column or table non-destructively (nullable or with a default) and start dual-writing to the old and new structures; 2) Migrate: backfill existing data in small batches; 3) Switch: update code to read from the new structure; 4) Contract: stop writing to the old column and drop it after a grace period. Never run ALTER TABLE ... DROP COLUMN or ADD COLUMN ... NOT NULL DEFAULT directly on a large table that live code still depends on.
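The Expand and Migrate steps can be sketched as follows. This is a minimal illustration, not a production migration: it uses Python's built-in sqlite3 module so it runs anywhere, and the `users` table with `first_name`, `last_name`, and the new `full_name` column is a hypothetical schema chosen for the example. On PostgreSQL or MySQL the SQL and locking behavior differ, so treat the batching logic as the point.

```python
import sqlite3


def expand(conn: sqlite3.Connection) -> None:
    """Expand: add the new column as nullable so existing writers keep working."""
    # Hypothetical column name; the old first_name/last_name columns stay in place.
    conn.execute("ALTER TABLE users ADD COLUMN full_name TEXT")


def backfill(conn: sqlite3.Connection, batch_size: int = 1000) -> int:
    """Migrate: fill full_name in primary-key order, one small batch per transaction.

    Returns the number of rows updated.
    """
    last_id = 0
    total = 0
    while True:
        # Keyset pagination: fetch the next window of ids after the last one seen.
        ids = [
            row[0]
            for row in conn.execute(
                "SELECT id FROM users WHERE id > ? ORDER BY id LIMIT ?",
                (last_id, batch_size),
            )
        ]
        if not ids:
            return total
        # Only touch rows the dual-write path has not already populated.
        cur = conn.execute(
            "UPDATE users SET full_name = first_name || ' ' || last_name "
            "WHERE id BETWEEN ? AND ? AND full_name IS NULL",
            (ids[0], ids[-1]),
        )
        conn.commit()  # short transactions keep per-batch lock time small
        total += cur.rowcount
        last_id = ids[-1]
```

Walking the table by primary key rather than repeatedly selecting `WHERE full_name IS NULL` means the loop always terminates, even if some rows legitimately end up with a NULL value. In a real system you would likely also pause between batches to limit load.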
Journey Context:
Direct ALTER TABLE on large tables can acquire aggressive locks (ACCESS EXCLUSIVE on PostgreSQL); when the change forces a table rewrite or full scan, reads and writes are blocked for minutes or hours. On MySQL, tools like pt-online-schema-change (trigger-based) or gh-ost (binlog replay) avoid long locks, but they add operational complexity and can fail under high write load. Expand/contract is an application-level pattern that avoids downtime without third-party tools, at the cost of temporary data redundancy and code complexity (dual-write logic). It also allows fast rollback: as long as dual writes are still running, reads can be switched back to the old schema if the new one fails.
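To make the dual-write and rollback idea concrete, here is a small sketch under stated assumptions: the `UserStore` class, its `read_from_new` flag, and the `users` table with `first_name`, `last_name`, and `full_name` columns are all hypothetical names invented for illustration. It uses Python's built-in sqlite3 module; a real service would typically drive the flag from a feature-flag system.

```python
import sqlite3
from typing import Optional


class UserStore:
    """Hypothetical data-access layer that dual-writes during a migration."""

    def __init__(self, conn: sqlite3.Connection, read_from_new: bool = False):
        self.conn = conn
        # Flip to True to read from the new column; flip back to roll reads back.
        self.read_from_new = read_from_new

    def save_name(self, user_id: int, first: str, last: str) -> None:
        # Dual write: old and new columns are updated in the same statement,
        # so they cannot drift apart while both schemas are live.
        self.conn.execute(
            "UPDATE users SET first_name = ?, last_name = ?, full_name = ? "
            "WHERE id = ?",
            (first, last, f"{first} {last}", user_id),
        )
        self.conn.commit()

    def get_name(self, user_id: int) -> Optional[str]:
        if self.read_from_new:
            row = self.conn.execute(
                "SELECT full_name FROM users WHERE id = ?", (user_id,)
            ).fetchone()
            return row[0] if row else None
        row = self.conn.execute(
            "SELECT first_name, last_name FROM users WHERE id = ?", (user_id,)
        ).fetchone()
        return f"{row[0]} {row[1]}" if row else None
```

Because every write keeps both representations current, toggling `read_from_new` in either direction returns the same data, which is what makes the rollback cheap. Once the old column is dropped in the Contract step, that escape hatch is gone.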
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T16:44:12.716810+00:00— report_created — created