I’m seeing an intermittent UnknownHostException in my Kubernetes cluster (GKE). It affects some microservices (built with Spring Boot 2.7.RELEASE on the eclipse-temurin:17-jre base image) when they call another service by its Kubernetes service name.
Environment:
Kubernetes version: 1.26 (Google Cloud)
Spring Boot version: 2.7.RELEASE
Java base image: eclipse-temurin:17-jre
Issue:
Occasionally, some services fail to resolve the hostname of other services and throw an UnknownHostException. Here is the relevant log snippet:
Exception caught. Returning HTTP status 500 INTERNAL_SERVER_ERROR.
org.springframework.web.client.ResourceAccessException: I/O error on GET request for "http://kantoor:8097/api/v1.0/configTopic/king": kantoor; nested exception is java.net.UnknownHostException: kantoor
at org.springframework.web.client.RestTemplate.doExecute(RestTemplate.java:784)
at org.springframework.web.client.RestTemplate.execute(RestTemplate.java:710)
at org.springframework.web.client.RestTemplate.exchange(RestTemplate.java:601)
Troubleshooting Steps Taken:
- Curl Tests: Ran over 10,000 curl requests from one service to another using the Kubernetes service name; all succeeded.
kubectl exec -it <pod> -n my-ns -- bash
curl http://kantoor:8081/api/health
- Monitoring kube-dns: Checked resource usage and resource limits; no issues found.
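Since the curl test above does not go through the JVM, a similar loop could be run from Java so that the lookups use `java.net.InetAddress` and the JVM's own DNS caching. The following is only a minimal sketch of such a probe, not something I have run in the cluster: the class name `DnsProbe`, the `Resolver` interface and the `countFailures` helper are hypothetical names made up for illustration, and setting both cache TTLs to 0 is an assumption made so that every attempt triggers a fresh lookup.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.security.Security;

public class DnsProbe {

    /** Hypothetical abstraction over the lookup, so the loop can be exercised without a network. */
    @FunctionalInterface
    public interface Resolver {
        InetAddress[] resolve(String host) throws UnknownHostException;
    }

    /** Resolves the host the given number of times and returns how many lookups failed. */
    public static int countFailures(Resolver resolver, String host, int attempts, long pauseMillis) {
        int failures = 0;
        for (int i = 0; i < attempts; i++) {
            try {
                resolver.resolve(host);
            } catch (UnknownHostException e) {
                failures++;
                System.err.println("attempt " + i + " failed: " + e.getMessage());
            }
            if (pauseMillis > 0) {
                try {
                    Thread.sleep(pauseMillis);
                } catch (InterruptedException ie) {
                    // Restore the interrupt flag and stop probing early.
                    Thread.currentThread().interrupt();
                    break;
                }
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        // Assumption: turn off the JVM's positive and negative DNS caches so that
        // each attempt performs a real lookup. Must be set before the first lookup.
        Security.setProperty("networkaddress.cache.ttl", "0");
        Security.setProperty("networkaddress.cache.negative.ttl", "0");

        String host = args.length > 0 ? args[0] : "kantoor";
        int attempts = args.length > 1 ? Integer.parseInt(args[1]) : 10000;

        int failures = countFailures(InetAddress::getAllByName, host, attempts, 10);
        System.out.println(failures + " of " + attempts + " lookups failed for " + host);
    }
}
```

The class would need to be compiled with a JDK and then run inside an affected pod (for example `java DnsProbe kantoor 10000`), because the `-jre` image may not include a compiler. If this loop fails occasionally while curl never does, that would point at the JVM side rather than at kube-dns.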
Additional Information:
kube-dns log sample:
I0511 02:08:39.807635 1 server.go:126] FLAG: --nameservers=""
I0511 02:08:39.807682 1 server.go:126] FLAG: --one-output="false"
I0511 02:08:39.807719 1 server.go:126] FLAG: --profiling="false"
I0511 02:08:39.807783 1 server.go:126] FLAG: --skip-headers="false"
I0511 02:08:39.807841 1 server.go:126] FLAG: --skip-log-headers="false"
I0511 02:08:39.807887 1 server.go:126] FLAG: --stderrthreshold="2"
I0511 02:08:39.807924 1 server.go:126] FLAG: --v="2"
I0511 02:08:39.807981 1 server.go:126] FLAG: --version="false"
I0511 02:08:39.808037 1 server.go:126] FLAG: --vmodule=""
I0511 02:08:39.812591 1 server.go:182] Starting SkyDNS server (0.0.0.0:10053)
I0511 02:08:39.812747 1 server.go:194] Skydns metrics enabled (/metrics:10055)
I0511 02:08:39.812764 1 dns.go:190] Starting endpointsController
I0511 02:08:39.812769 1 dns.go:193] Starting serviceController
I0511 02:08:39.812882 1 dns.go:186] Configuration updated: {TypeMeta:{Kind: APIVersion:} Federations:map[] StubDomains:map[] UpstreamNameservers:[]}
I0511 02:08:39.813722 1 log.go:245] skydns: ready for queries on cluster.local. for tcp://0.0.0.0:10053 [rcache 0]
I0511 02:08:39.813737 1 log.go:245] skydns: ready for queries on cluster.local. for udp://0.0.0.0:10053 [rcache 0]
I0511 02:08:40.313285 1 dns.go:224] Initialized services and endpoints from apiserver
I0511 02:08:40.313312 1 server.go:150] Setting up Healthz Handler (/readiness)
I0511 02:08:40.313319 1 server.go:155] Setting up cache handler (/cache)
I0511 02:08:40.313338 1 server.go:136] Status HTTP port 8081
Despite these checks, the issue still occurs intermittently. Any insights or suggestions on what might be causing this intermittent UnknownHostException would be greatly appreciated.