Until now, updating the image of a pod never broke anything. This last time, however, the pod got a new IP in the cluster and my service became unreachable (502 Bad Gateway). I debugged every step of the request path, from browser to container. In the logs of the ingress-nginx-controller pod, I found this suspicious line:
2024/05/15 15:36:27 [error] 7032#7032: *230837456 connect() failed (113: Host is unreachable) while connecting to upstream, client: <client-ip>, server: mysubdomain.mydomain.com, request: "GET / HTTP/2.0", upstream: "http://10.244.0.24:<port>/", host: "mysubdomain.mydomain.com"
The upstream IP (10.244.0.24) is the old pod IP, from before my update. After rolling out the update, the pod IP became 10.244.0.61.
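In case it helps to reproduce the check, something like this should show whether the Service's endpoints track the new pod IP (the namespace, label, and service name here are placeholders for my real ones):

```shell
# List the pods backing the service, with their current cluster IPs
kubectl get pods -n my-namespace -l app=my-app -o wide

# Show the endpoint IPs the Service currently advertises;
# ingress-nginx forwards traffic to these, so a stale IP here is a problem
kubectl get endpoints my-service -n my-namespace -o wide
```

In my case the pod list shows 10.244.0.61, while nginx is still trying 10.244.0.24.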
This is the first time it has happened in my cluster. Maybe the pod IP normally doesn’t change across updates, or maybe the ingress-nginx-controller is normally notified when an upstream IP changes. Or maybe something else; I don’t know how it works internally, but until now it always did.
So…
- What went wrong?
- How can I fix it? Is there a way to refresh the stale upstream entry in the ingress-nginx-controller pod? Maybe by restarting the node?
- How can I prevent it from happening again?
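On the "how to fix it" point, the workaround I'm tempted to try (purely a guess on my part, assuming a standard ingress-nginx install in the ingress-nginx namespace) is to restart the controller itself rather than the whole node, so it rebuilds its upstream list:

```shell
# Restart the ingress-nginx controller pods; on startup the controller
# re-reads Services and Endpoints from the API server and regenerates its config
kubectl rollout restart deployment ingress-nginx-controller -n ingress-nginx

# Watch the rollout until the new controller pod is Ready
kubectl rollout status deployment ingress-nginx-controller -n ingress-nginx
```

Is this a reasonable workaround, or does it just mask an underlying problem with endpoint propagation?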
Thanks in advance for sharing your knowledge. Deeply appreciated.