Classifying Network Vendors at Internet Scale
This work addresses the need for large-scale classification of network device vendors as a way to understand Internet infrastructure trends. Its contribution is incremental, however, because it builds on existing scanning and clustering methods.
The paper tackles the problem of identifying network device vendors at Internet scale. The authors create a labeled dataset of more than 160,000 devices using scanning and clustering, train a classifier to predict vendors, and apply that classifier to traceroute data to analyze vendor trends.
In this paper, we develop a method to create a large, labeled dataset of visible network device vendors across the Internet by mapping network-visible IP addresses to device vendors. We use Internet-wide scanning, banner grabs of network-visible devices across the IPv4 address space, and clustering techniques to assign labels to more than 160,000 devices. We subsequently probe these devices and use features extracted from the responses to train a classifier that can accurately classify device vendors. Finally, we demonstrate how this method can be used to understand broader trends across the Internet by predicting device vendors in traceroutes from CAIDA's Archipelago measurement system and subsequently examining vendor distributions across these traceroutes.
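The pipeline described above (label devices from banners, train on probe-response features, then predict vendors for traceroute hops) can be illustrated with a toy sketch. This is not the paper's implementation: keyword matching stands in for the clustering step, a majority-vote lookup stands in for the classifier, and every name, keyword, feature value, and address below is a hypothetical placeholder chosen for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical keyword table; a stand-in for the paper's clustering-based labeling.
VENDOR_KEYWORDS = {"cisco": "Cisco", "junos": "Juniper", "routeros": "MikroTik"}


def label_from_banner(banner):
    """Return a vendor label if a known keyword appears in the banner, else None."""
    text = banner.lower()
    for keyword, vendor in VENDOR_KEYWORDS.items():
        if keyword in text:
            return vendor
    return None


def train(labeled):
    """Map each feature tuple to the vendor most often observed with it."""
    votes = defaultdict(Counter)
    for features, vendor in labeled:
        votes[features][vendor] += 1
    return {features: c.most_common(1)[0][0] for features, c in votes.items()}


def predict(model, features):
    """Look up a vendor for a feature tuple; unseen or missing features map to 'unknown'."""
    return model.get(features, "unknown")


def vendor_distribution(model, traceroutes, probe_features):
    """Count predicted vendors over every hop of every traceroute."""
    counts = Counter()
    for path in traceroutes:
        for hop in path:
            counts[predict(model, probe_features.get(hop))] += 1
    return counts


# Toy inputs (documentation-range addresses, made-up banners and feature values).
banners = {
    "192.0.2.1": "Cisco IOS Software",
    "192.0.2.2": "JUNOS 12.3",
    "192.0.2.3": "SSH-2.0-OpenSSH",  # no vendor keyword, so it stays unlabeled
}
# Hypothetical features per address: (initial TTL, IP ID behavior).
probe_features = {
    "192.0.2.1": (255, "incremental"),
    "192.0.2.2": (64, "random"),
    "192.0.2.3": (64, "zero"),
    "198.51.100.7": (255, "incremental"),  # no banner, but it was probed
}

# Step 1: build the labeled training set from banners.
labeled = []
for ip, banner in banners.items():
    vendor = label_from_banner(banner)
    if vendor is not None:
        labeled.append((probe_features[ip], vendor))

# Step 2: train, then step 3: predict vendors along traceroute paths.
model = train(labeled)
traceroutes = [
    ["192.0.2.1", "198.51.100.7", "192.0.2.3"],
    ["192.0.2.2", "203.0.113.9"],  # last hop was never probed
]
distribution = vendor_distribution(model, traceroutes, probe_features)
print(distribution)
```

The point of the sketch is the separation of concerns: banners are needed only to obtain labels, while prediction relies solely on probe-response features, which is what would allow a classifier to cover traceroute hops that expose no banner at all.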