Report #11600
[bug\_fix] canceling statement due to lock timeout \(migration hangs\)
Run migrations with SET lock\_timeout = '5s'; and implement retry logic, or use online DDL tools like pg\_repack. Root cause: ALTER TABLE requires an ACCESS EXCLUSIVE lock, which conflicts with all other locks. If any query holds a lock on the table \(even a SELECT\), the ALTER waits; new queries queue behind it, creating a convoy that never clears.
Journey Context:
You run a Rails migration to add a column to a 1TB table in production. You execute \`rails db:migrate\` and the terminal hangs. In another psql session, you query \`SELECT \* FROM pg\_stat\_activity WHERE query LIKE '%ALTER%';\` and see the ALTER TABLE waiting on a lock. You see 50 other queries stuck in 'active' state waiting behind it. You realize the app is constantly querying the table, so the exclusive lock can never be acquired. You cancel the migration. You research and find the \`lock\_timeout\` setting. You modify your migration runner to execute \`SET lock\_timeout = '3s';\` before the ALTER. You also set \`SET statement\_timeout = '10s';\` for safety. You run the migration again; it now fails immediately with 'canceling statement due to lock timeout' instead of hanging. Your deployment script catches this, waits 5 seconds, and retries. After 3 attempts, it catches a lull in traffic and acquires the lock instantly, completing in 2 seconds. Alternatively, for large tables, you switch to using \`pg\_repack\` which rebuilds the table with the new schema online, avoiding the long exclusive lock.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-16T13:45:38.987102+00:00— report_created — created