Not All Roads Lead to Rome: How VPN Selection Alters What We Measure and Infer about Web Infrastructure
This work matters for researchers conducting web-measurement studies: it challenges the common assumption that VPNs within a country are interchangeable vantage points, an assumption that can lead to inaccurate inferences about web infrastructure.
This paper demonstrates that different commercial VPN providers within the same country yield materially different conclusions about web infrastructure, including endpoint location, hosting provider, and physical replicas. The variability is primarily driven by the VPN's in-country DNS infrastructure, CDN steering based on the exit network, and peering paths.
Web-measurement studies treat commercial VPNs as interchangeable vantage points within a country, assuming that any VPN in a particular country is as good as any other. We show that this assumption does not hold: the same country measured through different VPN providers yields materially different conclusions about where endpoints sit, who hosts them, and which physical replicas serve them. Using large-scale browser-based measurements across fourteen countries and four major VPN providers, complemented by targeted DNS and replica-selection probes, we examine sources of this variability across three layers of the VPN-to-endpoint path: vantage identity, name resolution, and replica selection. We find that the variability is driven primarily by layers below the client: commercial VPN providers operate their own in-country DNS infrastructure, often intercepting queries regardless of client configuration; CDNs steer based on the exit network, directing identical queries to different replicas; and peering paths route identical DNS answers to different physical facilities. We distill these findings into a set of reporting practices for VPN-based web measurement.
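One simple way to quantify the kind of cross-provider disagreement described above is to compare the sets of IP addresses that a single hostname resolves to through each VPN provider in the same country. The sketch below is illustrative only and is not the paper's methodology: the provider labels, the addresses (taken from reserved documentation ranges), and the choice of Jaccard similarity as the agreement metric are all assumptions made for the example.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two collections of IP addresses (1.0 = identical)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # two empty answer sets are treated as agreeing
    return len(a & b) / len(a | b)


def pairwise_agreement(answers_by_provider):
    """Map each provider pair to the similarity of their resolved address sets."""
    return {
        (p, q): jaccard(answers_by_provider[p], answers_by_provider[q])
        for p, q in combinations(sorted(answers_by_provider), 2)
    }


# Hypothetical resolutions of one hostname through three VPN providers
# exiting in the same country (addresses are documentation-range placeholders).
observations = {
    "provider_a": {"192.0.2.10", "192.0.2.11"},
    "provider_b": {"192.0.2.10", "198.51.100.7"},
    "provider_c": {"203.0.113.5"},
}

agreement = pairwise_agreement(observations)
for pair, score in agreement.items():
    print(pair, round(score, 3))
```

A low score for a provider pair indicates that the two vantage points were handed largely different endpoints for the same name, which is the situation in which treating them as interchangeable would change downstream inferences about location or hosting.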