Problem
How can you check if a Kubernetes pod has restarted due to a memory issue or any other issue?
Solution
Follow the steps below to determine whether a pod in your Kubernetes cluster has restarted due to a memory issue.
- Run the following command to gather detailed information about the pod, including its status, restart count, and the reason for any previous restarts:
kubectl describe pod <pod_name>
For example:
kubectl describe pod nextgen-gw-0
- Analyze the output, focusing on the Last State section of the container status (a jsonpath shortcut for pulling just these fields is sketched after this list):
vprobe:
    Container ID:   containerd://40c8585cf88dc7d0dd4e43560dc631ef559b0c92e6d5d429719a384aaea77777
    Image:          us-central1-docker.pkg.dev/opsramp-registry/gateway-cluster-images/vprobe:17.0.0
    Image ID:       us-central1-docker.pkg.dev/opsramp-registry/gateway-cluster-images/vprobe@sha256:8de1a98c3c14307fa4882c7e7422a1a4e4d507d2bbc454b53f905062b665e9d2
    Port:           <none>
    Host Port:      <none>
    State:          Running
      Started:      Mon, 29 Jan 2024 12:01:30 +0530
    Last State:     Terminated
      Reason:       OOMKilled
      Exit Code:    137
      Started:      Mon, 29 Jan 2024 12:00:42 +0530
      Finished:     Mon, 29 Jan 2024 12:01:29 +0530
    Ready:          True
    Restart Count:  1
- In the Last State section, look for the following indications:
  - Reason: A reason of OOMKilled means the container was killed because it ran out of memory (it exceeded its memory limit).
  - Exit Code: An exit code of 137 (128 + SIGKILL) indicates the container was forcefully terminated; together with the OOMKilled reason, it points to an out-of-memory condition.
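If you only need the restart count and the last termination reason, you can query those fields directly instead of reading the full describe output. The commands below are a minimal sketch, assuming the same pod name (nextgen-gw-0) and namespace as the example above.
# Restart count for each container in the pod
kubectl get pod nextgen-gw-0 -o jsonpath='{.status.containerStatuses[*].restartCount}'
# Reason recorded for the last container termination (prints OOMKilled after an OOM kill)
kubectl get pod nextgen-gw-0 -o jsonpath='{.status.containerStatuses[*].lastState.terminated.reason}'
# Recent events for the pod, which may also record the kill
kubectl get events --field-selector involvedObject.name=nextgen-gw-0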
Conclusion
If the pod’s last state shows Reason: OOMKilled and Exit Code: 137, the pod was restarted because it ran out of memory (an OOM condition). You may need to allocate more memory to the pod by raising its memory request and limit, or optimize the workload so it stays within the existing limit; one way to adjust the limits is sketched below.
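To act on an OOMKilled finding, raise the memory request and limit on the workload that owns the pod rather than on the pod itself. The commands below are a sketch, not a prescribed fix: they assume the pod nextgen-gw-0 is managed by a StatefulSet named nextgen-gw and that the affected container is vprobe (as in the example output), and the memory values are placeholders to be sized against the workload's observed usage.
# Raise the memory request/limit on the owning workload (values are illustrative)
kubectl set resources statefulset nextgen-gw --containers=vprobe --requests=memory=512Mi --limits=memory=1Gi
# Confirm the new values are present on the restarted pod
kubectl get pod nextgen-gw-0 -o jsonpath='{.spec.containers[?(@.name=="vprobe")].resources}'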