Report #37844
[bug_fix] PostgreSQL ERROR: deadlock detected during concurrent updates
Implement an application-level retry mechanism with exponential backoff for transactions that fail with deadlock errors, and enforce a consistent global ordering for acquiring row locks across all code paths.
Journey Context:
An e-commerce platform experiences sporadic 500 errors during flash sales, and the PostgreSQL log shows ERROR: deadlock detected at the moments of peak inventory updates. Analysis reveals two concurrent transactions: Transaction A updates inventory for Product X and then Product Y, while Transaction B (from a different API call) updates Product Y and then Product X. Each acquires an exclusive row lock on its first product, then blocks waiting for the row the other holds. Once a lock wait exceeds deadlock_timeout (one second by default), PostgreSQL's deadlock detector runs and aborts one of the transactions, but the application has no handling for this error class, so the failure surfaces as a 500. The development team first tries adding random sleep delays, which reduces the frequency of deadlocks but leaves the underlying circular wait in place. Deeper investigation into PostgreSQL's locking behavior shows that these deadlocks stem from inconsistent lock ordering between the two code paths. The team implements two fixes. First, a decorator catches SerializationFailure (SQLSTATE 40001) and deadlock (SQLSTATE 40P01) errors and retries the entire transaction with exponential backoff and jitter, on the premise that occasional transient deadlocks cannot be ruled out in a high-concurrency system. Second, they refactor all inventory update functions to sort product IDs in ascending order before updating, so every transaction acquires row locks in the same order. This removes the circular dependency, making deadlocks between these specific operations impossible in principle, while the retry logic covers deadlocks that involve other lock types or code paths.
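The two fixes described above can be sketched roughly as follows. This is an illustrative outline, not the team's actual code: the names retry_on_deadlock and apply_inventory_deltas, the inventory table, and its quantity and product_id columns are all hypothetical. The sketch assumes a DB-API style cursor with %s placeholders, and a driver whose exceptions carry the SQLSTATE in a sqlstate attribute (psycopg 3) or a pgcode attribute (psycopg2).

```python
import functools
import random
import time

# SQLSTATE codes: 40P01 = deadlock_detected, 40001 = serialization_failure.
RETRYABLE_SQLSTATES = {"40P01", "40001"}


def _sqlstate(exc):
    # psycopg 3 exposes `sqlstate`; psycopg2 exposes `pgcode`.
    return getattr(exc, "sqlstate", None) or getattr(exc, "pgcode", None)


def retry_on_deadlock(max_attempts=5, base_delay=0.05, max_delay=2.0,
                      sleep=time.sleep):
    """Retry a whole-transaction function on deadlock or serialization failure.

    Hypothetical helper. The wrapped function must open, commit, and roll
    back its own transaction, because the aborted one cannot be resumed.
    """
    def decorator(txn_func):
        @functools.wraps(txn_func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    return txn_func(*args, **kwargs)
                except Exception as exc:
                    retryable = _sqlstate(exc) in RETRYABLE_SQLSTATES
                    if not retryable or attempt == max_attempts:
                        raise
                    # Exponential backoff with full jitter: sleep a random
                    # time between 0 and the (capped) exponential bound.
                    bound = min(max_delay, base_delay * 2 ** (attempt - 1))
                    sleep(random.uniform(0, bound))
        return wrapper
    return decorator


def apply_inventory_deltas(cursor, deltas):
    """Apply quantity changes in ascending product_id order.

    `deltas` maps product_id -> quantity change. Sorting the IDs means every
    transaction takes its row locks in the same order.
    Table and column names are assumptions for illustration.
    """
    for product_id in sorted(deltas):
        cursor.execute(
            "UPDATE inventory SET quantity = quantity + %s "
            "WHERE product_id = %s",
            (deltas[product_id], product_id),
        )
```

In use, the decorator would wrap a function that runs one complete transaction (for example, one that opens a connection context, calls apply_inventory_deltas, and commits), so that each retry starts from a fresh transaction. The retry limit and delay values shown are placeholders and would need tuning against real traffic.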
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T18:00:01.901862+00:00— report_created — created