Report #87017
[bug_fix] OOMKilled: container exceeded its memory limit and was killed by the kernel
Increase the container's `resources.limits.memory` (and `requests.memory` proportionally) after profiling actual usage. If the spike is transient, also tune the workload, for example by processing smaller batches or streaming data instead of holding whole messages in memory; if it is a leak, fix the application. Check `kubectl top pod` and node metrics to confirm the new request and limit fit within the node's allocatable memory.
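The capacity check can be done by hand, but a small helper makes the arithmetic explicit. The following is a minimal sketch, not an official Kubernetes API: `parse_memory` and `request_fits` are hypothetical names, the parser handles only a common subset of the quantity suffixes, and the node figures are invented for illustration (real values would come from the Allocatable and Allocated resources sections of `kubectl describe node`).

```python
# Multipliers for a common subset of Kubernetes memory suffixes.
SUFFIXES = {
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40,
    "k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12,
}


def parse_memory(quantity: str) -> int:
    """Convert a quantity such as '512Mi' or '1Gi' to bytes."""
    quantity = quantity.strip()
    for suffix, factor in SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(float(quantity[: -len(suffix)]) * factor)
    return int(quantity)  # no suffix: already a byte count


def request_fits(request: str, allocatable: str, already_requested: str) -> bool:
    """True if the request fits in the memory the node has left to hand out."""
    free = parse_memory(allocatable) - parse_memory(already_requested)
    return parse_memory(request) <= free


# Hypothetical node: 4Gi allocatable, with 3Gi or 3.5Gi already requested.
print(request_fits("800Mi", "4Gi", "3Gi"))    # 800Mi against 1Gi free
print(request_fits("800Mi", "4Gi", "3.5Gi"))  # 800Mi against 512Mi free
```

The check uses the request rather than the limit because the scheduler places pods based on requests; a limit above the node's free memory can still be scheduled but leaves the pod exposed to node-level memory pressure.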
Journey Context:
A Python worker pod restarts every few hours. `kubectl describe pod` shows Last State: Terminated, Reason: OOMKilled, Exit Code: 137 (128 + 9, meaning the process received SIGKILL). Repeated `kubectl top pod` readings show memory climbing steadily until it reaches the 512Mi limit, at which point the kernel OOM killer terminates the process. The container has no swap and the cgroup limit is hard, so usage cannot exceed it. The app processes a large queue and loads entire messages into memory. You raise the limit to 1Gi and add a request of 800Mi so the scheduler places the pod on a node with enough headroom. The pod now survives the workload batch and the restarts stop.
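Raising the limit fixed this case, but the note that the worker loads entire messages into memory points at the workload tuning the fix mentions. The sketch below shows one possible approach, assuming message bodies can be read as a file-like stream (the report does not say which queue client is used, so `iter_chunks` and `count_bytes` are hypothetical helpers and the `BytesIO` object stands in for a real message body).

```python
import io
from typing import BinaryIO, Iterator


def iter_chunks(stream: BinaryIO, chunk_size: int = 64 * 1024) -> Iterator[bytes]:
    """Yield the stream in fixed-size pieces instead of reading it all at once."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # empty bytes means end of stream
            break
        yield chunk


def count_bytes(stream: BinaryIO, chunk_size: int = 64 * 1024) -> int:
    """Example consumer: only one chunk is held in memory at a time."""
    total = 0
    for chunk in iter_chunks(stream, chunk_size):
        total += len(chunk)
    return total


# Stand-in for a large message body.
body = io.BytesIO(b"x" * 200_000)
print(count_bytes(body))
```

With this pattern peak memory is bounded by `chunk_size` rather than by the largest message, which may allow a lower limit than 1Gi if the per-chunk processing does not accumulate state.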
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T04:38:54.472504+00:00— report_created — created