Report #97757
[bug_fix] OOMKilled
Run `kubectl describe pod` to confirm `Reason: OOMKilled`, then inspect memory usage with `kubectl top pod` or node metrics. After profiling the workload, either fix the memory leak causing the spike or increase the container's `resources.limits.memory` (and `requests.memory` proportionally). If the workload is bursty, consider the Vertical Pod Autoscaler, which sizes requests from observed usage.
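The confirmation step can also be scripted. The following is a minimal sketch, assuming the pod's JSON has been captured with `kubectl get pod -o json`. The function name `oom_killed_containers` and the sample pod and container names are hypothetical; the `status.containerStatuses[].lastState.terminated` fields follow the Kubernetes Pod status schema.

```python
import json


def oom_killed_containers(pod: dict) -> list:
    """Return names of containers whose last termination reason was OOMKilled."""
    names = []
    for status in pod.get("status", {}).get("containerStatuses", []):
        # lastState is empty if the container has never been restarted.
        terminated = status.get("lastState", {}).get("terminated") or {}
        if terminated.get("reason") == "OOMKilled":
            names.append(status["name"])
    return names


# Hypothetical sample, shaped like the output of `kubectl get pod -o json`.
sample_pod = json.loads("""
{
  "metadata": {"name": "data-processor"},
  "status": {
    "containerStatuses": [
      {"name": "worker",
       "lastState": {"terminated": {"reason": "OOMKilled", "exitCode": 137}}},
      {"name": "sidecar", "lastState": {}}
    ]
  }
}
""")

print(oom_killed_containers(sample_pod))
```

A check like this only reports the most recent termination of each container, so it complements rather than replaces watching memory usage over time.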
Journey Context:
A Python data-processing pod is killed every few hours. `kubectl describe pod` shows `Last State: Terminated`, `Reason: OOMKilled`, and `Exit Code: 137` (128 + 9, meaning the process received SIGKILL), and `kubectl top pod` shows memory climbing right before each kill. Profiling locally reveals a Pandas DataFrame being duplicated in a loop. After fixing the duplication and raising the Deployment's memory limit from 512Mi to 2Gi, the pod survives peak load. OOMKilled happens because the Linux OOM killer terminates a process in the container once the container's cgroup memory limit is reached; adding headroom or reducing consumption keeps usage below that limit, so the killer does not fire.
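The local profiling step might look roughly like the sketch below. It is an illustration only: plain Python lists stand in for the DataFrame, the standard-library `tracemalloc` module stands in for whatever profiler was actually used, and all function names are hypothetical.

```python
import tracemalloc


def peak_bytes(func, *args):
    """Run func(*args) and return the peak traced allocation in bytes."""
    tracemalloc.start()
    try:
        func(*args)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak


def process_with_copies(rows, passes):
    """Leaky pattern: every pass keeps a full duplicate of the data alive."""
    kept = []
    for _ in range(passes):
        snapshot = list(rows)   # full duplicate on each iteration
        kept.append(snapshot)   # reference retained, so it cannot be freed
    return len(kept)


def process_in_place(rows, passes):
    """Fixed pattern: read the existing data and keep only small results."""
    totals = []
    for _ in range(passes):
        totals.append(sum(rows))  # no duplicate of rows is created
    return len(totals)


# rows is built before tracing starts, so only the loops' allocations count.
rows = list(range(100_000))
leaky_peak = peak_bytes(process_with_copies, rows, 10)
fixed_peak = peak_bytes(process_in_place, rows, 10)
print(f"with copies: {leaky_peak / 1e6:.2f} MB, in place: {fixed_peak / 1e6:.2f} MB")
```

Comparing peaks before and after a change gives a rough basis for choosing the new memory limit, rather than guessing at a value.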
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-26T04:38:59.059129+00:00— report_created — created