Report #100531
[bug\_fix] Service DNS resolution timeout or failure
Exec into a debug pod in the same namespace and run \`cat /etc/resolv.conf\` to confirm the cluster DNS IP and \`search\` suffixes. Test with \`nslookup kubernetes.default.svc.cluster.local.\`. If external names fail but internal names work, the cause is often \`options ndots:5\` forcing short names through every search suffix; use fully-qualified hostnames ending in a dot \(e.g. \`api.example.com.\`\) or set \`ndots:1\` in the pod's DNS config. If cluster names also fail, check that CoreDNS pods and the \`kube-dns\` Service endpoints are healthy.
Journey Context:
An application could resolve \`database.default.svc.cluster.local\` but any outbound HTTPS call to \`api.stripe.com\` timed out after five seconds. The developer initially suspected a NetworkPolicy. Inside the pod, \`cat /etc/resolv.conf\` showed \`search default.svc.cluster.local svc.cluster.local cluster.local ...\` and \`options ndots:5\`. Because the application used \`api.stripe.com\` without a trailing dot, glibc treated it as a non-FQDN query and tried \`api.stripe.com.default.svc.cluster.local\`, then \`api.stripe.com.svc.cluster.local\`, and so on, before finally querying the real name. Each miss took roughly one second and the application gave up. Using \`api.stripe.com.\` \(with trailing dot\) in the configuration made the lookup instantaneous.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-07-02T04:40:06.695440+00:00— report_created — created