Report #24047
[bug\_fix] Postgres replication lag causing stale reads on hot standby
Set synchronous\_commit to remote\_apply on the primary for critical transactions, so that the primary waits for the standby to apply the WAL before acknowledging the commit. This only has an effect when the standby is listed in synchronous\_standby\_names; the setting can be scoped to a single transaction with SET LOCAL. Alternatively, route read-after-write operations to the primary, or check the replay lag \(replay\_lag in pg\_stat\_replication, queried on the primary\) before reading from the standby. Root cause: asynchronous streaming replication lets the standby lag behind the primary, so a read sent to the replica immediately after a write can return pre-transaction state.
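A minimal sketch of the per-transaction variant of this fix, assuming a DB-API 2.0 connection to the primary \(for example one from psycopg2\) with autocommit off, and assuming synchronous\_standby\_names is already configured on the primary. The helper name critical\_write is hypothetical and not part of any library.

```python
from typing import Any, Sequence


def critical_write(conn: Any, sql: str, params: Sequence[Any] = ()) -> None:
    """Run one write whose commit waits until the standby has applied it.

    `conn` is assumed to be a DB-API 2.0 connection to the primary with
    autocommit disabled, so the statements below share one transaction.
    """
    cur = conn.cursor()
    try:
        # SET LOCAL only lasts until the end of the current transaction,
        # so other writes on this connection keep the default setting.
        cur.execute("SET LOCAL synchronous_commit = 'remote_apply'")
        cur.execute(sql, params)
        # With remote_apply and a synchronous standby configured, the
        # commit is acknowledged only after the standby has replayed it.
        conn.commit()
    except Exception:
        conn.rollback()
        raise
    finally:
        cur.close()
```

Scoping the setting with SET LOCAL keeps the extra commit latency limited to the transactions that need read-your-writes behavior on the standby.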
Journey Context:
A SaaS application uses PostgreSQL 15 with one primary and one hot standby for read scaling. The web frontend writes a user profile update to the primary, then redirects to a profile page that queries the standby. Approximately 5% of requests show the old data for 1-3 seconds. The developer initially suspects caching \(Redis\) and flushes keys, but the stale data persists. Checking pg\_stat\_replication reveals a replay\_lag of 2-3 seconds during peak load. After reading the synchronous replication docs, the team decides that forcing synchronous\_commit = remote\_apply for all writes would add too much latency. Instead, they modify the ORM to use the primary connection for any read that occurs within 5 seconds of a write in the same user session. This eliminates the observed stale reads while preserving read scaling for unrelated traffic.
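The routing rule the team settled on can be sketched roughly as follows. This is an illustration only, not the team's actual ORM change: the class name StickyPrimaryRouter, its methods, and the in-memory dictionary are all assumptions. A multi-process deployment would more likely keep the last-write timestamp in the user's session or a shared store.

```python
import time
from typing import Callable, Dict, Hashable


class StickyPrimaryRouter:
    """Send a session's reads to the primary shortly after it writes."""

    def __init__(
        self,
        window_seconds: float = 5.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._window = window_seconds
        self._clock = clock  # injectable so the rule can be tested
        self._last_write: Dict[Hashable, float] = {}

    def record_write(self, session_id: Hashable) -> None:
        # Call this whenever the session commits a write on the primary.
        self._last_write[session_id] = self._clock()

    def target_for_read(self, session_id: Hashable) -> str:
        last = self._last_write.get(session_id)
        if last is not None and self._clock() - last < self._window:
            return "primary"  # the standby may not have replayed the write yet
        # Window expired (or no write seen): forget the entry, use the standby.
        self._last_write.pop(session_id, None)
        return "standby"
```

The 5-second window is chosen to exceed the 2-3 second replay\_lag observed at peak; if lag grows beyond the window, stale reads can reappear, so the window should be revisited whenever the lag profile changes.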
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T18:46:22.339509+00:00— report_created — created