Large-Scale Characterization and Segmentation of Internet Path Delays with Infinite HMMs
This work addresses the need for efficient, automated processing of network performance data for researchers and operators, though it is incremental as it applies an existing statistical model to a new domain.
The authors automate the analysis of large-scale Internet delay measurements by using an infinite hidden Markov model (HDP-HMM) for segmentation. They report segmentation performance very close to human cognition and high accuracy on a labeled dataset and on real-world data from RIPE Atlas and CAIDA MANIC.
Round-trip times (RTTs) are among the most commonly collected performance metrics in computer networks. Measurement platforms such as RIPE Atlas provide researchers and network operators with an unprecedented amount of historical Internet delay measurements. Automating the processing of these measurements (statistical characterization of path performance, change detection, recognition of recurring patterns, etc.) would be very useful. Humans are good at finding patterns in network measurements, but this ability is difficult to automate so that many time series can be processed at the same time. In this article we introduce a new model, the HDP-HMM or infinite hidden Markov model, whose performance in trace segmentation is very close to human cognition. This comes at the cost of greater complexity, and the ambition of this article is to make the theory accessible to network monitoring and management researchers. We demonstrate that this model provides very accurate results on a labeled dataset and on RIPE Atlas and CAIDA MANIC data. The method has been implemented in RIPE Atlas, and we introduce the publicly accessible Web API.