Report #72174
[bug_fix] Async connection pool exhaustion causing indefinite hangs
Always use async context managers (`async with pool.acquire()`) to guarantee connection release; size the pool based on concurrency limits.
Journey Context:
A Python microservice uses asyncpg with `create_pool(max_size=20)`. Under moderate load, endpoints start hanging for 60+ seconds and then time out. The logs show no database errors. Inspection reveals that `pool.acquire()` blocks because all 20 connections are checked out, yet none of them is running a query in Postgres (none appear as active in `pg_stat_activity`). Code review shows the intended pattern, `conn = await pool.acquire()` followed by `try: ... finally: await pool.release(conn)`, but not every code path follows it: in some, an early `return` or an exception handler bypasses the release. Over hours, connections leak until the pool is empty (20/20 acquired, none in use). New requests then wait indefinitely, or until the caller's timeout fires. Fix: refactor all DB calls to use `async with pool.acquire() as conn:`, which guarantees release even on exceptions and early returns. The fix also sets `max_inactive_connection_lifetime` on the pool so that idle connections are closed after a period of inactivity. The hangs disappear immediately because connections are always returned to the pool.
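The leak and the fix can be reproduced without a database. The sketch below is a minimal illustration, not the service's actual code: `FakePool`, `_Acquire`, `leaky_lookup`, `safe_lookup` and `demo` are hypothetical names, and `FakePool` only imitates the shape of a pool whose `acquire()` result can be either awaited or used with `async with`. It is not asyncpg's implementation.

```python
import asyncio


class FakePool:
    """Hypothetical in-memory stand-in for a connection pool (not asyncpg)."""

    def __init__(self, max_size: int) -> None:
        # Each "connection" is just an integer id sitting in an idle queue.
        self._idle: asyncio.Queue = asyncio.Queue()
        for conn_id in range(max_size):
            self._idle.put_nowait(conn_id)

    def idle(self) -> int:
        return self._idle.qsize()

    def acquire(self) -> "_Acquire":
        return _Acquire(self)

    async def release(self, conn: int) -> None:
        self._idle.put_nowait(conn)


class _Acquire:
    """Can be awaited directly or used as an async context manager."""

    def __init__(self, pool: FakePool) -> None:
        self._pool = pool
        self._conn = None

    def __await__(self):
        # Plain `await pool.acquire()`: the caller must release manually.
        return self._pool._idle.get().__await__()

    async def __aenter__(self) -> int:
        self._conn = await self._pool._idle.get()
        return self._conn

    async def __aexit__(self, *exc_info) -> None:
        # Runs on normal exit, early return, and exceptions alike.
        await self._pool.release(self._conn)


async def leaky_lookup(pool: FakePool, user_id: int):
    conn = await pool.acquire()
    if user_id < 0:
        return None  # early return skips the release below: connection leaks
    await asyncio.sleep(0)  # stands in for a query
    await pool.release(conn)
    return user_id


async def safe_lookup(pool: FakePool, user_id: int):
    async with pool.acquire() as conn:
        if user_id < 0:
            return None  # __aexit__ still returns the connection
        await asyncio.sleep(0)  # stands in for a query
        return user_id


async def demo() -> tuple:
    leaky_pool, safe_pool = FakePool(max_size=2), FakePool(max_size=2)
    for _ in range(2):
        await leaky_lookup(leaky_pool, -1)
        await safe_lookup(safe_pool, -1)
    try:
        # With every connection leaked, the next acquire can only time out.
        await asyncio.wait_for(leaky_pool.acquire(), timeout=0.05)
        timed_out = False
    except asyncio.TimeoutError:
        timed_out = True
    return leaky_pool.idle(), safe_pool.idle(), timed_out


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

After two early returns the leaky pool has no idle connections left and the next acquire blocks until its timeout, while the pool used through `async with` still has both connections available. This mirrors the 20/20 exhaustion described above on a smaller scale.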
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T03:43:46.386358+00:00— report_created — created