Report #38828
[bug\_fix] FATAL: sorry, too many clients already
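The immediate fix is to lower the client-side connection pool size \(e.g., reduce DB\_POOL\_SIZE in the app\) so that the worst-case total across all instances, pool\_size \* instance\_count plus any overflow the pool allows \(SQLAlchemy's max\_overflow defaults to 10 per engine\), stays below max\_connections minus superuser\_reserved\_connections \(default 3\). For sustained scale, put PgBouncer \(or a similar pooler\) in transaction pooling mode between the app and Postgres, and have the app use a small, fixed pool size against PgBouncer. Raising max\_connections is also an option, but it requires a server restart, every extra backend costs memory, and on older Postgres versions it may also mean raising the kernel shmmax limit. The root cause is that Postgres forks one backend process per connection, and max\_connections \(default 100\) is exhausted when per-instance pools are sized without accounting for horizontal scaling.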
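The sizing rule above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the PoolBudget class and its field names are hypothetical, and the defaults assume a stock Postgres configuration \(max\_connections = 100, superuser\_reserved\_connections = 3\).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PoolBudget:
    """Hypothetical helper: worst-case client count vs. usable Postgres slots."""

    replicas: int                # number of app instances (pods)
    pool_size: int               # persistent connections per instance
    max_overflow: int = 0        # extra burst connections per instance
    max_connections: int = 100   # assumed Postgres default
    reserved: int = 3            # assumed superuser_reserved_connections default

    @property
    def worst_case_clients(self) -> int:
        # Every instance may open its full pool plus its overflow.
        return self.replicas * (self.pool_size + self.max_overflow)

    @property
    def headroom(self) -> int:
        # Slots left for everything else (admin sessions, migrations, jobs).
        return (self.max_connections - self.reserved) - self.worst_case_clients

    def fits(self) -> bool:
        # Require at least one spare slot, matching the strict inequality.
        return self.headroom > 0


if __name__ == "__main__":
    reported = PoolBudget(replicas=5, pool_size=20)
    print(reported.worst_case_clients, reported.headroom, reported.fits())

    reduced = PoolBudget(replicas=5, pool_size=5)
    print(reduced.worst_case_clients, reduced.headroom, reduced.fits())
```

With the numbers from this report \(5 replicas, pool size 20\) the budget is already negative before any overflow connections are counted, which matches the observed error.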
Journey Context:
The developer deploys a FastAPI app to a Kubernetes cluster with 5 replicas, each configured with a SQLAlchemy pool size of 20. Postgres max\_connections is left at the default of 100. Under load, new pods start and immediately log 'FATAL: sorry, too many clients already'. The developer checks pg\_stat\_activity and sees 100 idle connections held by previous pods that have not timed out yet. They first try raising max\_connections to 200, but hit kernel shared memory limits and would still risk saturation as replicas are added. They then realize that the math \(5 replicas \* 20 pool = 100\) consumes every slot, leaving no headroom for background workers, admin sessions, or the slots Postgres reserves for superusers. The fix is to install PgBouncer in transaction pooling mode and point the app at it with a small pool \(e.g., size 5\), so that hundreds of app instances can share a small number of actual Postgres backends.
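On the app side, the small fixed pool described above might be configured roughly as follows. This is a sketch under assumptions, not the reporter's actual code: the function names, the DATABASE\_URL variable, the credentials, and the pgbouncer host name are all hypothetical, and port 6432 assumes PgBouncer's default listen port. The keyword arguments themselves \(pool\_size, max\_overflow, pool\_timeout, pool\_pre\_ping\) are standard SQLAlchemy create\_engine parameters.

```python
import os
from typing import Optional


def pgbouncer_pool_kwargs(pool_size: int = 5) -> dict:
    """Pool settings for an engine that connects through PgBouncer."""
    return {
        "pool_size": pool_size,   # small, fixed number of client connections
        "max_overflow": 0,        # no burst connections beyond pool_size
        "pool_timeout": 10,       # seconds to wait for a free connection
        "pool_pre_ping": True,    # discard connections the pooler has closed
    }


def make_engine(url: Optional[str] = None):
    """Build the engine; SQLAlchemy is imported lazily so the helper above
    stays usable without it."""
    from sqlalchemy import create_engine

    # Hypothetical URL: host, credentials, and database name are placeholders.
    url = url or os.environ.get(
        "DATABASE_URL",
        "postgresql+psycopg2://app:secret@pgbouncer:6432/appdb",
    )
    return create_engine(url, **pgbouncer_pool_kwargs())
```

Setting max\_overflow to 0 keeps the per-pod connection count predictable, so the total seen by PgBouncer is simply pool\_size times the replica count. Drivers that rely on server-side prepared statements may need extra configuration under transaction pooling, so that is worth checking before rollout.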
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T19:39:01.255040+00:00— report_created — created