Report #6262
[bug\_fix] PostgreSQL ERROR: deadlock detected
Deadlocks occur when concurrent transactions acquire row-level locks in opposite orders \(e.g., TxA locks row 1 then row 2, while TxB locks row 2 then row 1\). PostgreSQL detects this cycle and aborts one transaction with a deadlock error. The fix is not to eliminate the deadlock entirely \(which requires strict global ordering of all row accesses\), but to implement application-level retry logic with exponential backoff for transactions that fail with a deadlock error code \(SQLSTATE 40P01\). Additionally, ensure all code paths that update multiple rows sort them by a consistent key \(e.g., primary key\) before locking to reduce deadlock frequency.
Journey Context:
During a flash sale, your e-commerce API logs thousands of "deadlock detected" errors. The stack trace points to an inventory deduction endpoint. Investigating, you find two code paths: one processes multi-item orders \(locking product A then B\), another handles inventory transfers \(locking product B then A\). Under high concurrency, these collide in a circular wait. You initially try increasing lock\_timeout, but that just changes the error type without solving the root cause. You realize PostgreSQL is correctly detecting and breaking the deadlock by killing one transaction; the application must handle this gracefully. You implement a decorator using tenacity that catches psycopg2.errors.DeadlockDetected \(SQLSTATE 40P01\), waits a random 10-100ms, and retries up to 3 times. You also refactor the transfer code to always sort product IDs before locking \(always lock lower ID first\). After deployment, flash sales proceed without user-facing errors; deadlocks still occur statistically but are transparently retried.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-15T23:40:34.646551+00:00— report_created — created