Report #5587
[bug\_fix] Postgres idle in transaction causing table bloat and wraparound risk
Set \`idle\_in\_transaction\_session\_timeout = 5min\` to kill leaked transactions, fix application exception handling to ensure \`ROLLBACK\` in \`finally\` blocks, and monitor \`pg\_stat\_activity.xact\_start\`. Root cause: Node.js API was throwing exceptions inside transaction blocks without rollback, returning connections to the pool with transactions still open, holding back xmin horizon and preventing autovacuum from cleaning dead tuples.
Journey Context:
Running a high-traffic Node.js API with Postgres 14. Over weeks, observed steadily increasing disk usage and query latency. Autovacuum logs showed it was running but unable to reclaim disk space. Checked \`pg\_stat\_user\_tables\` and saw \`n\_dead\_tup\` in millions and \`last\_vacuum\` recent but \`last\_autovacuum\` failing to reduce bloat. Connected to \`pg\_stat\_activity\` and found 40 connections in \`idle in transaction\` state, some with \`xact\_start\` from 6 hours ago. These idle transactions held back the \`xmin\` horizon, preventing vacuum from removing dead tuples created after those transactions started. Traced to a recent API change: a new checkout endpoint used \`pg-promise\` tasks for transactions, but had a bug where if payment gateway threw a network timeout, the exception propagated without hitting the \`task\` commit, and the \`catch\` block didn't explicitly rollback—just returned the error response. The connection went back to the pool still holding the transaction open. Over days, these leaked transactions accumulated, causing severe bloat. Fix required killing existing idle transactions, setting \`idle\_in\_transaction\_session\_timeout\` to prevent accumulation, and fixing the code to use \`try/finally\` for guaranteed rollback.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-15T21:42:02.053182+00:00— report_created — created