Report #2606
[bug_fix] OOMKilled container with exit code 137 and Status.reason OOMKilled
Increase the container's memory limit (and the request if needed), or profile and fix the memory leak; for Java workloads, also make sure the heap size fits within the container limit.
Journey Context:
A workload runs fine under light traffic, but its pods restart under load. kubectl get pod shows RESTARTS climbing, and kubectl describe pod shows Last State: Terminated, Reason: OOMKilled, Exit Code: 137 (128 + 9, meaning the process was killed with SIGKILL). dmesg or the node logs show "Memory cgroup out of memory: Killed process". The developer first adds replicas, but each pod still dies because the per-pod limit is unchanged. The root cause is that the cgroup limit set by resources.limits.memory is lower than the process's peak memory usage; when usage crosses the limit, the kernel OOM killer terminates the container's process. The fix is to raise the limit to a value backed by benchmarking under realistic load, and to set the request close to steady-state usage. If usage grows without bound, raising the limit only delays the kill, and the real fix is to repair the leak in code. For JVM apps this often means setting -XX:MaxRAMPercentage alongside the container limit so that the heap, plus non-heap memory such as metaspace and thread stacks, stays within the cgroup boundary.
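The sizing arithmetic described above can be sketched in Python. This is a minimal illustration, not part of the original report: the function names, the 25% headroom default, and the subset of quantity suffixes handled are all assumptions made for the example. It also assumes a container-aware JVM, which derives its maximum heap as MaxRAMPercentage of the cgroup memory limit.

```python
# Hypothetical helpers for reasoning about memory limits; names and the
# default headroom are illustrative assumptions, not a Kubernetes API.

# Subset of Kubernetes quantity suffixes (binary and decimal).
_UNITS = {
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40,
    "k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12,
}


def parse_memory(quantity: str) -> int:
    """Convert a memory quantity such as '512Mi' or '2G' to bytes."""
    q = quantity.strip()
    # Check two-letter suffixes first so 'Mi' is not mistaken for 'M'.
    for suffix in sorted(_UNITS, key=len, reverse=True):
        if q.endswith(suffix):
            return int(float(q[: -len(suffix)]) * _UNITS[suffix])
    return int(q)  # no suffix: plain byte count


def will_oom(limit: str, peak_bytes: int) -> bool:
    """True if the observed peak usage exceeds resources.limits.memory."""
    return peak_bytes > parse_memory(limit)


def suggest_limit(peak_bytes: int, headroom: float = 0.25) -> int:
    """Peak usage from benchmarking plus a safety margin (assumed 25%)."""
    return int(peak_bytes * (1 + headroom))


def jvm_max_heap(limit: str, max_ram_percentage: float) -> int:
    """Approximate max heap for a given limit and -XX:MaxRAMPercentage."""
    return int(parse_memory(limit) * max_ram_percentage / 100)


if __name__ == "__main__":
    peak = 600 * 2**20  # 600 MiB peak measured under load (example value)
    print(will_oom("512Mi", peak))
    print(suggest_limit(peak))
    print(jvm_max_heap("1Gi", 75.0))
```

With a 1Gi limit and -XX:MaxRAMPercentage=75, roughly a quarter of the limit remains for non-heap JVM memory; whether that margin is enough depends on the workload and should be checked by measurement.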
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-15T13:27:48.574781+00:00— report_created — created