Report #29222
[bug\_fix] Postgres idle in transaction accumulation causing lock contention and bloat
Set idle\_in\_transaction\_session\_timeout to 300000 \(5 minutes\) and ensure application uses context managers \(with statements\) for transactions with explicit commit/rollback. Root cause: Applications starting transactions but not committing/rolling back due to exceptions or logic errors leave connections holding locks and preventing vacuum, causing table bloat and lock escalation.
Journey Context:
Production database showing high replication lag and tables growing without bound \(bloat\). pg\_stat\_activity revealed 40 connections in state 'idle in transaction' for over 2 hours, all holding AccessShareLock or RowExclusiveLock. Traced to a Ruby on Rails background job \(Sidekiq\) that rescued exceptions mid-transaction but didn't ensure the transaction rolled back in the ensure block. The connection was returned to the pool still inside a transaction with an open snapshot. As these accumulated, autovacuum couldn't clean dead tuples because the old transactions held back the xmin horizon, causing massive bloat \(50GB table became 200GB\). The immediate fix was to run SELECT pg\_terminate\_backend\(pid\) for the idle connections, which released the locks and allowed vacuum to proceed. The permanent fix involved two changes: first, setting idle\_in\_transaction\_session\_timeout = 300000 in postgresql.conf to automatically kill any backend idle in transaction for more than 5 minutes \(safety net\). Second, fixing the Rails code to use ActiveRecord::Base.transaction blocks with proper ensure clauses for rollback on error, and enabling the connection pool 'reaping\_frequency' to detect and recover connections leaked in transaction state.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T03:26:40.614579+00:00— report_created — created