Report #8196
[bug\_fix] FATAL: could not write to WAL file \(No space left on device\) caused by replication slot WAL retention
Identify stale replication slots, then either drop them with \`pg\_drop\_replication\_slot\(\)\` or advance them with \`pg\_replication\_slot\_advance\(\)\`, and set \`max\_slot\_wal\_keep\_size\` to cap how much WAL a slot can retain.
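As a rough sketch of the preventive half of this fix, the retention cap could be set as follows. This assumes PostgreSQL 13 or later (where `max_slot_wal_keep_size` and the `wal_status` and `safe_wal_size` columns exist) and superuser access; the `10GB` value is purely illustrative and not taken from the report.

```sql
-- Cap the WAL that any replication slot may hold back.
-- '10GB' is an assumed example value; size it to the free space on the WAL volume.
ALTER SYSTEM SET max_slot_wal_keep_size = '10GB';

-- The setting takes effect on a configuration reload; no restart is needed.
SELECT pg_reload_conf();

-- Check how close each slot is to the cap.
SELECT slot_name, wal_status, safe_wal_size
FROM pg_replication_slots;
```

The trade-off is that a slot which falls further behind than the cap is invalidated rather than allowed to fill the disk, so a consumer that returns after that point has to be re-synchronized from scratch.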
Journey Context:
Disk usage in the pg\_wal directory on a database server suddenly climbs by gigabytes per hour. The DBA checks pg\_stat\_replication and sees no active replicas, but pg\_replication\_slots shows a slot named 'old\_replica' whose restart\_lsn is far behind the current WAL position. The slot was created for a standby that died weeks ago and was never removed. Because the slot still exists, Postgres must retain every WAL file from restart\_lsn onward in case the replica reconnects, which prevents WAL recycling, so the disk fills. The DBA initially considers deleting WAL files manually, but the docs warn that this causes corruption. Instead, the DBA identifies the orphaned slot and runs SELECT pg\_drop\_replication\_slot\('old\_replica'\);. The retained WAL files become eligible for removal at the next checkpoint, and disk usage stabilizes. This works because a replication slot is a persistent commitment to retain WAL until it is consumed; removing the commitment releases the retention requirement.
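The diagnosis and cleanup described above might look roughly like this. It is a sketch, assuming it runs on the primary of a PostgreSQL 10 or later server (for `pg_current_wal_lsn()` and `pg_wal_lsn_diff()`); the `retained_wal` alias is an illustrative name, and the slot name is the one from this report.

```sql
-- List every slot with the amount of WAL it is holding back, largest first.
SELECT slot_name,
       slot_type,
       active,
       restart_lsn,
       pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) AS retained_wal
FROM pg_replication_slots
ORDER BY pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn) DESC NULLS LAST;

-- Drop the orphaned slot. This raises an error if the slot is still active,
-- which acts as a guard against removing a slot that a consumer is using.
SELECT pg_drop_replication_slot('old_replica');

-- Optional: request a checkpoint now instead of waiting for the next scheduled
-- one, so the released WAL segments are removed or recycled sooner.
CHECKPOINT;
```

An inactive slot with a large `retained_wal` value is the usual signature of the situation in this report. Before dropping one, it is worth confirming that no standby or logical consumer is expected to reconnect to it.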
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-16T04:49:25.032191+00:00— report_created — created