I am trying to deploy 200 dynamic Spring Boot pods (created from Job objects) from another Spring Boot pod (created from a Deployment object). Each dynamic pod has CPU and memory limits of 500m and 1.3 GB respectively.
The pods are created in sets of 10, with a 2 min delay before the next set of 10 is created. When the total number of pods reaches around 100, some of the dynamic pods (around 3 to 5) fail to resolve the service DNS name of another pod and throw java.net.UnknownHostException:
org.springframework.web.client.ResourceAccessException: I/O error on POST request for "http://app-svc.namespace:8000/api/management/v1/fetch-gnes-snes": app-svc.namespace.; nested exception is java.net.UnknownHostException: app-svc.namespace.
at org.springframework.web.client.RestTemplate.doExecute(RestTemplate.java:791)
at org.springframework.security.oauth2.client.OAuth2RestTemplate.doExecute(OAuth2RestTemplate.java:138)
at org.springframework.web.client.RestTemplate.execute(RestTemplate.java:717)
at org.springframework.web.client.RestTemplate.exchange(RestTemplate.java:608)
at com.fujitsu.fnc.fums.session_manager.repo.RestCallImpl.getGNEAndSNEDetails(RestCallImpl.java:189)
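To check whether the lookup failure is transient, a small standalone probe like the one below could be run inside an affected pod. This is only a sketch: the class name `DnsProbe`, the method `resolveWithRetry`, and the attempt count and pause are illustrative and not part of our application; only the host name `app-svc.namespace` comes from the trace above. The 15 s pause is chosen on the assumption that the JVM's negative DNS cache (`networkaddress.cache.negative.ttl`) is at its default of 10 s, so each retry triggers a fresh lookup.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.time.Duration;

// Hypothetical helper, not taken from the application in the stack trace.
final class DnsProbe {

    /**
     * Tries to resolve the host up to maxAttempts times, pausing between failures.
     * Returns the 1-based number of the attempt that succeeded, or -1 if none did
     * (or if the thread was interrupted while waiting).
     */
    static int resolveWithRetry(String host, int maxAttempts, Duration pause) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                InetAddress[] addresses = InetAddress.getAllByName(host);
                System.out.printf("attempt %d: %s resolved to %d address(es)%n",
                        attempt, host, addresses.length);
                return attempt;
            } catch (UnknownHostException e) {
                System.out.printf("attempt %d: %s not resolved (%s)%n",
                        attempt, host, e.getMessage());
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(pause.toMillis());
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt(); // keep the interrupt flag set
                    return -1;
                }
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        String host = args.length > 0 ? args[0] : "app-svc.namespace";
        int attempt = resolveWithRetry(host, 5, Duration.ofSeconds(15));
        System.out.println(attempt > 0
                ? "resolved on attempt " + attempt
                : "never resolved");
    }
}
```

If the name resolves on a later attempt, that would suggest a temporary DNS overload or timeout rather than a missing service record.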
I can reproduce this every time I create the 200 pods: the UnknownHostException occurs in 3 to 5 pods once the total number of pods reaches around 100.
Sometimes I also see an i/o timeout error in the CoreDNS pods. I am not sure whether it is related to the UnknownHostException seen in the dynamic Spring Boot pods.
Can anyone please explain the reason for this behavior, and how to address it?
NOTE: We never hit the UnknownHostException when we created 5 pods at a time with the same 2 min delay.
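To put the two rollout rates side by side, here is a rough sketch of the schedule arithmetic. It assumes the first set starts at t = 0 and that the 2 min delay is measured between the starts of consecutive sets; the class and method names are illustrative only. Under those assumptions, sets of 10 reach the 100-pod mark when the 10th set starts (about 18 minutes in), while sets of 5 reach it at the 20th set (about 38 minutes in).

```java
import java.time.Duration;

// Hypothetical helper that only models the timing described above.
final class BatchSchedule {

    /** Number of sets needed to create totalPods, rounding up. */
    static int batchCount(int totalPods, int batchSize) {
        return (totalPods + batchSize - 1) / batchSize;
    }

    /** Pods created once the given 1-based set has been started. */
    static int podsAfterBatch(int batchNumber, int batchSize, int totalPods) {
        return Math.min(batchNumber * batchSize, totalPods);
    }

    /** Time from the start of the first set to the start of the given 1-based set. */
    static Duration startOffset(int batchNumber, Duration delay) {
        return delay.multipliedBy(batchNumber - 1L);
    }

    public static void main(String[] args) {
        Duration delay = Duration.ofMinutes(2);
        for (int batchSize : new int[] {10, 5}) {
            int batch = batchCount(100, batchSize); // set that reaches 100 pods
            System.out.printf("sets of %d: %d pods after set %d, at about %d min%n",
                    batchSize,
                    podsAfterBatch(batch, batchSize, 200),
                    batch,
                    startOffset(batch, delay).toMinutes());
        }
    }
}
```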
Cluster information:
Kubernetes version: 1.23
Cloud being used: Bare-Metal VMs created from VMware
Installation method: Air Gapped RKE2 Installation
Host OS: RHEL 8.6
CNI and version: calico - v3.22.1
CRI and version: crictl - v1.23.0
3 - Control Nodes (54GB RAM and 24 core CPU)
15 - Compute Nodes (54GB RAM and 24 core CPU)