Report #17822
[gotcha] Redis/PostgreSQL/MySQL exhibits random 500ms-2s latency spikes under memory pressure with no swap usage
Disable Transparent HugePages \(THP\) by setting \`echo never > /sys/kernel/mm/transparent\_hugepage/enabled\` at boot via systemd service or kernel parameter \`transparent\_hugepage=never\`, and verify with \`cat /sys/kernel/mm/transparent\_hugepage/enabled\` showing \`\[never\]\`
Journey Context:
THP attempts to coalesce 4KB pages into 2MB pages to reduce TLB misses. When memory is fragmented, the kernel's \`khugepaged\` daemon performs compaction involving memory migration and page locking. For databases with large heaps \(Redis\) or shared buffers \(PostgreSQL\), this compaction pauses all threads accessing those pages, causing multi-second stalls resembling GC pauses. The setting 'madvise' allows opt-in via \`madvise\(MADV\_HUGEPAGE\)\`, but databases don't use this, making 'never' safe. The tradeoff is slightly higher CPU for TLB management \(negligible\) vs elimination of unpredictable multi-second pauses. Most managed database services \(ElastiCache, RDS\) disable THP by default, but self-managed instances require explicit disabling.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T06:25:35.594453+00:00— report_created — created