Recently, I encountered a situation where employees working at home while connected to our corporate VPN were unable to access a certain network resource which would normally be accessible only to on-site and VPN users. Let’s call that resource importantserver.test.com.
importantserver.test.com has an RFC1918 address; let’s use 10.51.0.7 as an example. This address is only reachable internally, so there should be no public DNS entry for importantserver.test.com. The only clients that should be able to obtain the IP address of this server via a DNS request are those on the internal network, whether on-site or through a VPN, which query the company DNS nameserver.
This should mean:
- On-campus/VPN-connected computer DNS query for importantserver.test.com:
  - result: A 10.51.0.7
- Any other system’s DNS query for importantserver.test.com:
  - result: no result found
This is all fine, because when a remote user is connected to the VPN server, their computer is supposed to be provided with a list of available DNS servers to service requests. From this list, Windows will first try resolving domain names using the DNS server of the interface with the lowest interface metric, which has generally been the most recent connection – the VPN connection. We can see this with the Get-DNSClientServerAddress PowerShell command:
In this example, we can see that the DNS server associated with the VPN has a metric of 72, higher than the metrics of the DNS servers provided by the Wi-Fi and Ethernet interfaces, so Windows uses this server only if it receives no results from the servers of interfaces with lower metrics.
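The ordering described here can be sketched as a simple sort by interface metric. This is only a model of the behaviour described in this post, not of Windows internals; the Wi-Fi and Ethernet metrics and all DNS server addresses are made-up values, and only the VPN metric of 72 comes from the example.

```python
from dataclasses import dataclass


@dataclass
class Interface:
    name: str
    metric: int        # lower metric = higher priority
    dns_servers: list  # DNS servers handed out on this interface


def dns_query_order(interfaces):
    """Return DNS servers in the order they would be tried: lowest metric first."""
    ordered = sorted(interfaces, key=lambda iface: iface.metric)
    return [server for iface in ordered for server in iface.dns_servers]


# Hypothetical values; only the VPN metric of 72 is taken from the post.
interfaces = [
    Interface("VPN", 72, ["10.51.0.2"]),
    Interface("Wi-Fi", 35, ["192.168.1.254"]),
    Interface("Ethernet", 25, ["192.168.1.1"]),
]

print(dns_query_order(interfaces))
```

With these values, the VPN’s DNS server ends up last in the list, which is the situation that causes trouble later on.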
Unfortunately, the Automatic Metric feature in Windows does not seem to be designed with consistent DNS resolution in mind. In most cases, the most recently established connection receives a low enough metric that the VPN’s DNS servers are used, but this is not always the case. However, this shouldn’t be a problem, because if Windows consults the DNS server of the user’s LAN and doesn’t find the result, it should move on and try the VPN’s DNS server, right?
That’s what I thought, but I was wrong. Windows only tries a subsequent server if it receives no response at all from the first DNS server. However, in this case, it was receiving an NXDOMAIN response, indicating that the domain name importantserver.test.com does not exist. Despite this response not helping the user, it is nonetheless a response, so Windows does not continue to try the next DNS server… instead, the DNS lookup simply fails.
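To make the difference between “no response” and “a negative response” concrete, here is a small simulation of the fallback behaviour I observed. It is a sketch of the logic as described above, not of the actual Windows DNS client; the fake servers and the helper names (make_server, resolve) are invented for the illustration.

```python
NXDOMAIN = "NXDOMAIN"  # the server answered: this name does not exist
TIMEOUT = None         # the server did not answer at all


def make_server(records, reachable=True):
    """Build a fake DNS server that returns an address, NXDOMAIN, or TIMEOUT."""
    def query(name):
        if not reachable:
            return TIMEOUT
        return records.get(name, NXDOMAIN)
    return query


def resolve(name, servers):
    """Try servers in order; move on only when a server gives no response at all."""
    for query in servers:
        response = query(name)
        if response is TIMEOUT:
            continue      # no response: try the next server
        return response   # any response, including NXDOMAIN, ends the lookup
    return TIMEOUT


isp_dns = make_server({})  # knows nothing about the internal name
vpn_dns = make_server({"importantserver.test.com": "10.51.0.7"})
dead_dns = make_server({}, reachable=False)

name = "importantserver.test.com"
print(resolve(name, [isp_dns, vpn_dns]))   # ISP answers first with NXDOMAIN
print(resolve(name, [dead_dns, vpn_dns]))  # first server silent, VPN DNS is consulted
print(resolve(name, [vpn_dns, isp_dns]))   # VPN DNS first: lookup succeeds
```

Only the second and third orderings return 10.51.0.7. In the first, the NXDOMAIN from the ISP’s server ends the lookup before the VPN’s server is ever asked.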
Disconnecting and reconnecting the VPN would usually get Windows to use the VPN’s DNS server, but not always. About a year ago, there were several incidents where remote workers were unable to access importantserver.test.com. The server was accessible by IP address, but the remote worker’s laptop would only perform DNS lookups using the user’s normal public DNS servers, never the VPN connection’s DNS servers. The quick fix that was put in place was to add the private RFC1918 address of the server, 10.51.0.7, to the public DNS records. This allowed clients with unpredictable DNS resolution behaviour to resolve the private IP of this server.
No one ever followed through on resolving the issue with the remote worker’s computer’s DNS settings, which presumably could have been done by toying around with the Interface Metric settings of the VPN connection, which was deployed via Group Policy. So, since that time, remote workers’ computers have been happily resolving the IP address of importantserver.test.com using either the public DNS system, or the private DNS server provided by the VPN, both of which would provide valid responses to the queries.
However, in December, something changed. I received a complaint from a user indicating that they could not connect to importantserver.test.com from their remote location via the VPN. When performing an nslookup on their system, I saw that the nameserver being consulted was that of their ISP, which was not returning any address. I did notice that if I manually queried one of the enterprise’s DNS servers, or even any major public DNS service, the correct result would be returned. As a temporary solution, I set the user’s device to use Cloudflare’s public DNS resolver as their DNS server until I had time to investigate further. However, within the next week, I received two more tickets from users with the same issue, both of whom were using the same ISP. When performing an nslookup against their ISP’s DNS servers, the information returned was not the expected IP address (10.51.0.7), but instead the NS records indicating the nameservers responsible for handling the request… let’s call them ns1.test.com and ns2.test.com:
- DNS query for importantserver.test.com on public DNS servers:
  - result: A 10.51.0.7
- DNS query for importantserver.test.com on the servers of the French ISP Free:
  - result: Served by:
  - result: Served by:
Interestingly, manually sending the query directly to one of these nameservers (e.g. nslookup importantserver.test.com ns1.test.com) would return the correct result, but it seemed that the recursion process of the ISP’s DNS resolvers was not completing the last step.
This continued on for a few weeks without any progress: we had a temporary fix, and more important things that needed our attention. In any case, I am not responsible for network administration at this job, so I wasn’t able to check the DNS nameserver configuration, but given that most public DNS servers were able to resolve the IP address of the hostname, the DNS server must have been correctly set up…
…and then it came to me. At home, I use a pfSense firewall, and I remember having seen options in the configuration to block RFC1918 addresses from entering. What if there was a similar option to block RFC1918 addresses from being resolved? (My box runs a recursive resolver, instead of using a public caching DNS server.) I checked in the configuration of the resolver at home, and found that:
- When performing a DNS lookup, I did not receive the RFC1918 IP address for importantserver.test.com, instead receiving the NS records – the same result received by the remote workers who used the default DNS servers of their ISP, Free.
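- The default configuration of my DNS resolver, unbound, contained lines listing private address ranges to be stripped from answers, including one for 10.0.0.0/8 (in unbound, this is what the private-address option does).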
After commenting out the line containing 10.0.0.0/8 and restarting the DNS service, I was able to successfully resolve the RFC1918 IP address for the server.
So, my guess is that around December, Free changed their DNS servers to a configuration that also restricted resolution of RFC1918 addresses. The servers recursively resolve the DNS chain until reaching the NS record, then when they receive the result containing an RFC1918 address, they throw out the result and simply return the previous resolution, the NS record, to the client.
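The behaviour I am guessing at can be sketched as follows. This is a toy model of my hypothesis, not unbound’s or Free’s actual implementation; the function names and the shape of the response dictionary are invented for the illustration.

```python
import ipaddress

# The three RFC1918 private ranges.
RFC1918 = [
    ipaddress.ip_network(net)
    for net in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]


def is_rfc1918(address):
    """Return True if the address falls inside one of the RFC1918 ranges."""
    ip = ipaddress.ip_address(address)
    return any(ip in net for net in RFC1918)


def filtered_response(answers, referral, filter_private=True):
    """Model a recursive resolver that drops private addresses from answers.

    answers:  A-record addresses returned by the authoritative nameserver
    referral: NS names learned during the previous step of the recursion
    """
    if filter_private:
        answers = [a for a in answers if not is_rfc1918(a)]
    if answers:
        return {"A": answers}
    # Nothing left after filtering: the client only sees the referral.
    return {"NS": referral}


nameservers = ["ns1.test.com", "ns2.test.com"]
print(filtered_response(["10.51.0.7"], nameservers))                        # filtering on
print(filtered_response(["10.51.0.7"], nameservers, filter_private=False))  # filtering off
```

With filtering on, the client gets only the NS names, which matches what the affected users saw. With filtering off, it gets the A record, which matches what happened after I commented out the 10.0.0.0/8 line at home.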
In this case, there is no magic bullet. The options are either to fix the computers’ VPN settings so that they always prioritise the VPN’s DNS server, or to manually configure the client computers to use a public DNS server that resolves RFC1918 addresses without a problem, although such behaviour is not guaranteed.
It’s a bit late for it, but Happy New Year! May all your resolutions be successful. 😉