NI · Mar 13

Linnaeus: A Hierarchical, Multi-Label Framework for Autonomous System Classification

Marcos Piotto, Ignacio Schuemer, Santiago T. Torres, Mariano G. Beiró, Esteban Carisimo, Fabián E. Bustamante

arXiv:2603.13649 · 11.4 · h-index: 7

AI Analysis

This work addresses the need for more accurate and nuanced classification of Internet ASes for network researchers and operators. It is incremental, however: it builds on existing methods by integrating new data sources and techniques.

The paper tackles the classification of Internet autonomous systems (ASes), addressing the failure of existing taxonomies to capture semantic and operational heterogeneity. The resulting framework, Linnaeus, achieves an overall precision of 0.83 and recall of 0.76 on a manually annotated dataset of nearly 2,000 ASes.

Autonomous systems (ASes) play diverse roles in today's Internet, from community and research backbones to hyperscale content providers and submarine-cable operators. However, existing taxonomies based solely on network-level features fail to capture their semantic and operational heterogeneity. In this paper, we present Linnaeus, a hierarchical AS-classification framework that combines network-centric data (e.g., topology, BGP announcements) with rich non-network features and leverages domain-adapted large language models alongside traditional machine-learning techniques. Linnaeus provides a two-level taxonomy with 18 top-level and 38 second-level classes, supports multi-label assignments to reflect hybrid roles (e.g., research backbone and transit provider), and provides an end-to-end pipeline from data ingestion to label inference. On a manually annotated dataset of nearly 2,000 ASes, Linnaeus achieves an overall precision and recall of 0.83 and 0.76, respectively. We further demonstrate its practical value through case studies, highlighting Linnaeus's potential to reveal both structural and semantic dimensions of Internet infrastructure.
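To make the multi-label evaluation concrete, the following is a minimal sketch of how an overall precision and recall can be computed when each AS carries a set of labels. It assumes micro-averaging over (AS, label) pairs; the abstract does not state which averaging scheme the authors use. The function name, the AS numbers (drawn from the range reserved for documentation), and the label assignments are all hypothetical and are not taken from the paper's dataset.

```python
from typing import Dict, Set, Tuple


def micro_precision_recall(
    truth: Dict[int, Set[str]],
    predicted: Dict[int, Set[str]],
) -> Tuple[float, float]:
    """Micro-averaged precision and recall over per-AS label sets.

    Assumption: every (AS, label) pair counts equally. This is one common
    convention, not necessarily the one used by Linnaeus.
    """
    tp = fp = fn = 0
    for asn, true_labels in truth.items():
        pred_labels = predicted.get(asn, set())
        tp += len(true_labels & pred_labels)   # labels correctly assigned
        fp += len(pred_labels - true_labels)   # labels assigned but wrong
        fn += len(true_labels - pred_labels)   # labels missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical annotations: one AS has a hybrid role (two labels).
truth = {
    64500: {"research backbone", "transit provider"},
    64501: {"content provider"},
    64502: {"submarine-cable operator"},
}
# Hypothetical model output: one missed label, one spurious label.
predicted = {
    64500: {"research backbone"},
    64501: {"content provider", "transit provider"},
    64502: {"submarine-cable operator"},
}

precision, recall = micro_precision_recall(truth, predicted)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this toy example, the missed "transit provider" label on the hybrid AS lowers recall, while the spurious "transit provider" label on the content provider lowers precision. This is why multi-label results are reported as a precision/recall pair rather than a single accuracy figure.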
