Recently, while working in my home lab, I tried to add a new vSphere host called nuc01.cloudhat.local to a vCenter 8 cluster, but I immediately received an error:
Cannot contact host nuc01.cloudhat.local
This was a new vCenter 8 environment and a new vSphere host. I started debugging the issue from my Windows jump VM. The new vSphere host responded to ping by IP address, but DNS resolution was not working. I switched to my DNS server and, of course, the DNS records for the new host were missing. I added the missing records and then went back to my Windows VM to check DNS resolution:
C:\>nslookup nuc01.cloudhat.local
Server: dns.cloudhat.local
Address: 192.168.0.222
Name: nuc01.cloudhat.local
Address: 192.168.0.111
This time it looked good, so I went back to vCenter to retry the “Add Hosts” operation, only to receive the same error: “Cannot contact host”.
Debugging “Cannot contact host” error
Time to SSH into the vCenter 8 instance, to see what’s happening.
Does DNS resolution for the new server work? No.
root@vcenter [ ~ ]# host nuc01.cloudhat.local
Host nuc01.cloudhat.local not found: 3(NXDOMAIN)
Does DNS resolution for another server work? Yes.
root@vcenter [ ~ ]# host esx0.cloudhat.local
esx0.cloudhat.local has address 192.168.0.100
Are we using the right DNS server? Yes.
root@vcenter [ ~ ]# cat /etc/resolv.conf
# This file is managed by man:systemd-resolved(8). Do not edit.
#
# This is a dynamic resolv.conf file for connecting local clients directly to
# all known uplink DNS servers. This file lists all configured search domains.
#
# Third party programs must not access this file directly, but only through the
# symlink at /etc/resolv.conf. To manage man:resolv.conf(5) in a different way,
# replace this symlink by a static file or a different symlink.
#
# See man:systemd-resolved.service(8) for details about the supported modes of
# operation for /etc/resolv.conf.
nameserver 127.0.0.1
nameserver 192.168.0.222
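The first entry, 127.0.0.1, points at a resolver running on the appliance itself, so a lookup can be answered locally before 192.168.0.222 is ever consulted. One way to see which resolver is handing out the failing answer is to query each configured nameserver directly. The following is only a minimal sketch: the function names list_nameservers and query_each are hypothetical helpers, not something shipped with vCenter, and the sketch relies on the host command accepting a server as an optional second argument.

```shell
#!/bin/sh
# Hypothetical helpers (not part of vCenter): list the nameservers from a
# resolv.conf-style file, then ask each one directly for the same name.

# list_nameservers FILE: print the address of every "nameserver" line.
list_nameservers() {
    awk '$1 == "nameserver" { print $2 }' "$1"
}

# query_each NAME [FILE]: run "host NAME SERVER" against every nameserver.
# FILE defaults to /etc/resolv.conf.
query_each() {
    name=$1
    file=${2:-/etc/resolv.conf}
    for server in $(list_nameservers "$file"); do
        echo "== $server =="
        # A failed lookup (for example NXDOMAIN) is reported, not fatal.
        host "$name" "$server" || echo "lookup failed via $server"
    done
}
```

Running query_each nuc01.cloudhat.local on the appliance would show the answer from each resolver side by side. If the lookup fails only via 127.0.0.1 while 192.168.0.222 returns the address, the stale answer is coming from the local resolver.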
At this point in my investigation, I started to suspect a DNS cache issue on vCenter. A quick search did not reveal anything relevant for vCenter 8, but I was able to find something similar for vCenter 6.7:
Restart the dnsmasq service to flush the old cache data.
https://docs.vmware.com/en/VMware-vSphere/6.7/com.vmware.vsphere.vcsa.doc/GUID-56C3BA9A-234E-4D81-A4BC-E2A37892A854.html
- Connect to the vCenter Server Appliance using SSH.
- Change to the Bash shell by entering the shell command.
- Run service dnsmasq restart to restart the dnsmasq service.
Time to check if this command still works on vCenter 8:
root@vcenter [ ~ ]# service dnsmasq restart
root@vcenter [ ~ ]# host nuc01.cloudhat.local
nuc01.cloudhat.local has address 192.168.0.111
Yes, it does work. Let’s move back to vCenter and retry the “Add Hosts” operation to confirm that the “Cannot contact host” error is gone.
In summary, a quick tip: if you add or change DNS records for a vSphere host shortly after vCenter has tried to resolve its name, vCenter may keep serving the old cached answer, so you need to flush the DNS cache on vCenter.
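If you want to script the recheck after restarting dnsmasq, a small retry loop is one option. This is a sketch under stated assumptions: wait_for_dns is a hypothetical function name, and LOOKUP_CMD is a hypothetical override (defaulting to host) that exists only so the loop can be tried without touching real DNS.

```shell
#!/bin/sh
# wait_for_dns NAME [TRIES] [DELAY]: retry a lookup until it succeeds.
# Hypothetical helper; LOOKUP_CMD defaults to "host" and can be overridden.
wait_for_dns() {
    name=$1
    tries=${2:-5}      # number of attempts before giving up
    delay=${3:-2}      # seconds to wait between attempts
    i=1
    while [ "$i" -le "$tries" ]; do
        if ${LOOKUP_CMD:-host} "$name" >/dev/null 2>&1; then
            echo "resolved $name after $i attempt(s)"
            return 0
        fi
        sleep "$delay"
        i=$((i + 1))
    done
    echo "still cannot resolve $name after $tries attempts" >&2
    return 1
}
```

On the appliance, this could be combined with the restart from above, for example: service dnsmasq restart && wait_for_dns nuc01.cloudhat.local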