Report #50091
[bug\_fix] idle in transaction timeout
The root cause is application code failing to properly close transactions \(COMMIT or ROLLBACK\) due to missing error handling, early returns, or connection pool leaks. When a client connection enters 'idle in transaction' state, it holds all locks acquired during that transaction indefinitely, blocking vacuum operations and potentially causing lock waits for other queries. Additionally, these connections remain occupied, reducing the effective size of the connection pool. The fix requires code-level transaction hygiene and database safeguards: \(1\) Ensure all application transaction blocks use try/ensure \(Ruby\), try/finally \(Java\), or context managers \(Python\) to guarantee commit/rollback regardless of return paths or exceptions. \(2\) Configure the database parameter idle\_in\_transaction\_session\_timeout \(PostgreSQL 9.6\+\) to automatically terminate connections idle in transaction for longer than a threshold \(e.g., 5 minutes\). This acts as a circuit breaker to prevent indefinite resource leaks, though applications must handle the unexpected disconnection gracefully. \(3\) Monitor pg\_stat\_activity for 'idle in transaction' states using queries filtering where state = 'idle in transaction' and xact\_start is old, alerting when count exceeds threshold.
Journey Context:
Your Sidekiq jobs start failing with 'PG::UnableToSend: no connection to the server' or simply hanging. You check pg\_stat\_activity and see dozens of rows with state='idle in transaction', some from hours ago, all originating from Sidekiq PIDs. You check your Rails code and see complex transactions with early returns or exception handling that doesn't ensure rollback. You realize that when a Sidekiq job raises an exception caught by Sidekiq's middleware, Rails might not always return the connection to the pool with the transaction cleared, or an explicit 'return' inside a transaction block leaves the transaction open. The rabbit hole: checking log\_min\_duration\_statement shows no long queries, but the connections are held open indefinitely, eventually exhausting the pool and causing 'too many clients' or application-level pool timeouts.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T14:33:38.138105+00:00— report_created — created