Report #61323
[bug\_fix] Inactive Postgres replication slots causing WAL bloat and slot exhaustion
Query pg\_replication\_slots to identify inactive slots \(active = false\), then run SELECT pg\_drop\_replication\_slot\('slot\_name'\); for each one that is no longer needed. If the slots are genuinely needed, increase max\_replication\_slots instead \(requires a server restart\); note that raising the limit only fixes slot-creation failures and does not stop inactive slots from retaining WAL. Root cause: max\_replication\_slots defaults to 10. Once all slots are in use, new slots \(e.g., logical decoding slots for Debezium\) cannot be created. More critically, each inactive slot holds back its restart\_lsn, so the server cannot remove WAL past that point, and pg\_wal/ grows without bound.
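The cleanup above can be sketched as follows. This is a minimal illustration, not a vetted tool: the query uses the standard pg\_replication\_slots view and pg\_wal\_lsn\_diff\(\) \(PostgreSQL 10 or later\), while build\_drop\_statements, its name\_prefix parameter, and the row-as-dict shape are hypothetical helpers introduced here. It only generates statements for review; nothing is executed.

```python
# Query to list inactive slots and how much WAL each one is holding back.
# Run it with any Postgres client; column names come from pg_replication_slots.
INACTIVE_SLOTS_SQL = """
SELECT slot_name,
       slot_type,
       active,
       restart_lsn,
       pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) AS retained_wal
FROM pg_replication_slots
WHERE NOT active
ORDER BY restart_lsn;
"""


def build_drop_statements(slots, name_prefix=""):
    """Return one pg_drop_replication_slot() call per inactive slot.

    `slots` is a list of dicts with at least `slot_name` and `active`
    (an assumed shape, e.g. rows fetched with a dict cursor).
    `name_prefix` optionally restricts the result to slots whose name
    starts with that prefix, so unrelated slots are left alone.
    """
    statements = []
    for slot in slots:
        if slot["active"]:
            # Active slots cannot be dropped and are still in use.
            continue
        name = slot["slot_name"]
        if not name.startswith(name_prefix):
            continue
        # Double any single quotes so the name is a safe SQL string literal.
        quoted = name.replace("'", "''")
        statements.append(f"SELECT pg_drop_replication_slot('{quoted}');")
    return statements


if __name__ == "__main__":
    example_rows = [
        {"slot_name": "debezium_old_version_1", "active": False},
        {"slot_name": "debezium_current", "active": True},
    ]
    for stmt in build_drop_statements(example_rows, "debezium_old_version_"):
        print(stmt)
```

Generating the statements instead of executing them leaves room to confirm that each slot really is abandoned before dropping it, since a dropped slot cannot be recovered.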
Journey Context:
A Kubernetes operator notices the Postgres primary's EBS volume growing by 50 GB/day. Investigation shows pg\_wal/ holds 500 GB of files even though archiving is disabled. Querying pg\_replication\_slots reveals 10 slots, 8 of them named 'debezium\_old\_version\_\*' with active = false, left behind by previous connector deployments that did not clean up. The DBA drops the stale slots, and the retained WAL files are removed or recycled at the next checkpoint. They then add monitoring for slots that stay at active = false for more than 24h and adjust the Debezium config to drop slots on shutdown.
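The 24h monitoring rule could look roughly like the sketch below. It assumes the monitor polls pg\_replication\_slots periodically and keeps its own record of when each slot was first seen inactive, since the view has historically not exposed that timestamp \(newer PostgreSQL releases may provide it directly\). The function names, the dict-based state, and the row shape are all hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Threshold from the report: alert on slots inactive for 24 hours or more.
STALE_AFTER = timedelta(hours=24)


def update_inactive_since(inactive_since, slots, now):
    """Return the new {slot_name: first_seen_inactive} state after one poll.

    Slots that are active again, or that no longer exist, are dropped from
    the state, so their timer restarts if they go inactive later.
    """
    current = {}
    for slot in slots:
        if not slot["active"]:
            name = slot["slot_name"]
            # Keep the earliest observation; otherwise start the clock now.
            current[name] = inactive_since.get(name, now)
    return current


def stale_slots(inactive_since, now, threshold=STALE_AFTER):
    """Return the sorted names of slots inactive for at least `threshold`."""
    return sorted(
        name for name, since in inactive_since.items() if now - since >= threshold
    )


if __name__ == "__main__":
    state = {}
    now = datetime.now(timezone.utc)
    polled_rows = [{"slot_name": "debezium_old_version_1", "active": False}]
    state = update_inactive_since(state, polled_rows, now)
    print(stale_slots(state, now + timedelta(hours=25)))
```

Because the state is held by the monitor, it should be persisted between runs; otherwise a restart of the monitor resets every timer.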
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T09:24:59.240387+00:00— report_created — created