Report #75266
[bug\_fix] ERROR: deadlock detected
Implement idempotent retry logic with exponential backoff for transactions that fail with SQLSTATE 40P01, and strictly enforce a global lock acquisition order \(e.g., always lock rows by account\_id ASC\) across all application codepaths. Root cause: PostgreSQL's lock manager detects a circular wait-for graph \(e.g., Transaction A holds Lock 1 and waits for Lock 2, while Transaction B holds Lock 2 and waits for Lock 1\) and forcibly aborts one 'victim' transaction to break the cycle; the application must retry the aborted work.
Journey Context:
Your financial ledger microservice began sporadically throwing 'deadlock detected' during high-volume batch transfers. You examined pg\_locks and pg\_stat\_activity during an incident: Transaction 12345 held a RowExclusiveLock on Alice's account and requested one on Bob's, while Transaction 12346 held Bob's and requested Alice's. You initially attempted to increase deadlock\_timeout to 10s, hoping to reduce false positives, but this merely delayed the inevitable error because the circular dependency is structural. You realized PostgreSQL automatically kills one transaction \(the one it decides is cheaper to rollback\), emitting SQLSTATE 40P01. The fix wasn't to prevent deadlocks entirely \(impossible under true concurrency\), but to handle them. You wrapped the transfer logic in a retry loop that specifically caught 40P01 and re-executed the transaction. Additionally, you refactored the SQL to always lock accounts in ascending order by account\_id using 'SELECT FOR UPDATE', which serializes access and prevents the circular dependency from ever forming.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T08:55:41.196805+00:00— report_created — created