Report #86826
[bug\_fix] Postgres migration lock timeout during DDL operations
Set lock\_timeout \(e.g., SET lock\_timeout = '5s';\) before DDL statements in migrations to fail fast if an exclusive lock cannot be acquired immediately, preventing indefinite hangs and allowing migrations to be retried during low-traffic windows.
Journey Context:
A developer runs an Alembic migration to add a NOT NULL column with a default value on a 500GB production table. The migration appears to hang indefinitely. In another terminal, querying pg\_stat\_activity shows the ALTER TABLE statement is active but has wait\_event\_type=Lock and wait\_event=relation. Querying pg\_locks reveals the migration is waiting for AccessExclusiveLock on the table, blocked by a long-running analytics query from a BI tool. The developer initially tries to kill the migration with Ctrl-C, but this leaves a stuck lock in Postgres. The rabbit hole reveals that DDL statements require aggressive locks \(often AccessExclusiveLock\) that queue behind existing queries, and new queries then queue behind the DDL, causing a complete traffic jam. Simply waiting indefinitely is dangerous. The fix is to set lock\_timeout \(e.g., 2 seconds\) before the migration statement. This causes Postgres to cancel the lock acquisition attempt with ERROR: canceling statement due to lock timeout if it cannot acquire the lock immediately. This allows the migration to fail fast, alert the operator, and be retried during a maintenance window or after terminating the blocking analytics query with pg\_terminate\_backend.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T04:19:36.645798+00:00— report_created — created