I have a Kubernetes CronJob that spins up a pod every midnight. There is only one container inside. It fetches some data, does some calculations, uploads several files and exits (it is a python:3.10-slim container that mainly does git operations, fetches and pulls, which can be memory intensive). The container has resource requests and limits defined, namely:
Limits:
  cpu:     300m
  memory:  2Gi
Requests:
  cpu:     300m
  memory:  1Gi
When I check the pods with kubectl get pods, I get the following:
NAME               READY   STATUS      RESTARTS   AGE
my-scheduled-pod   0/1     OOMKilled   0          11h
Looking into the pod with kubectl describe pod, I see that the Status of the pod is “Succeeded”, but for the container I see the following:
State:          Terminated
  Reason:       OOMKilled
  Exit Code:    0
  Started:      Thu, 08 Aug 2024 00:09:03 +0200
  Finished:     Thu, 08 Aug 2024 01:04:57 +0200
Ready:          False
Restart Count:  0
However, everything went as it should: the container finished everything in about an hour, managed to upload the result files and exited successfully. There is no sign of anything that would indicate it was killed, restarted, or otherwise behaved abnormally. I checked the pod/container logs with kubectl logs, and it was up and running during that whole time.
On the following graph you can see container_memory_usage_bytes in yellow and container_memory_cache in green:
[Graph: container memory usage over time]
You can see that memory usage is maxed out for a while, and at some point the cache drops to 0, shortly after which it starts increasing again.
I don’t understand what went wrong here, or why the pod has been labeled “OOMKilled” in kubectl get pods, when everything looks normal from my perspective and the container was not restarted or terminated abnormally.