Martha Cash

2 Papers

2.2 NI Apr 28
On the Role of Time Series Clustering in Traffic Matrix Prediction

Martha Cash, Charlotte Fowler, Alexander M. Wyglinski

This paper analyzes the role of time-series clustering in traffic matrix (TM) prediction. Traffic flows within a TM often exhibit heterogeneous behavior, which can reduce the effectiveness of global forecasting models that predict all flows jointly. To address this, we propose a clustering-based prediction framework that groups flows into smaller subsets and trains a separate predictor for each group. We explore four traffic-flow representations for clustering, namely histogram, autocorrelation function (ACF), power spectral density (PSD), and naïve partitioning, and examine how the choice of representation and the number of clusters K affect prediction performance. Experiments using the publicly available Abilene and GÉANT datasets show that clustering consistently improves over global forecasting baselines, while remaining substantially less costly than local prediction. The results further show that most of the performance gain is achieved at moderate values of K, with diminishing returns as the number of clusters increases. Although different clustering representations produce different partitions of the traffic flows, they often achieve similar root mean squared error (RMSE). This suggests that the main benefit of clustering lies in decomposing the TM prediction task into smaller subproblems, while the exact cluster structure plays a more limited role in determining overall prediction accuracy.
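As a rough illustration of the representation-then-cluster step described in this abstract, the following sketch computes histogram and ACF features for a few toy flows and groups them. Everything in it is an assumption made for illustration: the abstract does not name the clustering algorithm, so a plain k-means with a deterministic farthest-first initialization stands in for it, and the function names, bin count, lag count, and toy flows are hypothetical. A PSD representation would follow the same pattern but is omitted here.

```python
def histogram_features(series, bins=8):
    """Normalized histogram of a flow's values after min-max scaling."""
    lo, hi = min(series), max(series)
    span = (hi - lo) or 1.0  # avoid division by zero for constant flows
    counts = [0] * bins
    for x in series:
        idx = min(int((x - lo) / span * bins), bins - 1)
        counts[idx] += 1
    return [c / len(series) for c in counts]

def acf_features(series, max_lag=4):
    """Sample autocorrelation of a flow at lags 1..max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    var = sum(d * d for d in dev) or 1.0  # guard against constant flows
    return [sum(dev[t] * dev[t + k] for t in range(n - k)) / var
            for k in range(1, max_lag + 1)]

def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=20):
    """Lloyd's k-means with farthest-first initialization (an assumed choice).
    Returns one cluster label per point."""
    centers = [list(points[0])]
    while len(centers) < k:
        farthest = max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        centers.append(list(farthest))
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: sq_dist(p, centers[j]))
                  for p in points]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster empties out
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Toy flows (hypothetical): two steadily growing, two oscillating.
flows = [
    [1, 2, 3, 4, 5, 6, 7, 8],
    [2, 4, 6, 8, 10, 12, 14, 16],
    [1, 9, 1, 9, 1, 9, 1, 9],
    [3, 5, 3, 5, 3, 5, 3, 5],
]
features = [acf_features(f, max_lag=2) for f in flows]
labels = kmeans(features, k=2)
```

With these toy flows, the ACF features separate the growing flows from the oscillating ones regardless of scale, so each pair lands in its own cluster. A separate predictor would then be trained on the flows of each cluster.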

LG Sep 18, 2025
Improving Internet Traffic Matrix Prediction via Time Series Clustering

Martha Cash, Alexander Wyglinski

We present a novel framework that leverages time series clustering to improve internet traffic matrix (TM) prediction using deep learning (DL) models. Traffic flows within a TM often exhibit diverse temporal behaviors, which can hinder prediction accuracy when training a single model across all flows. To address this, we propose two clustering strategies, source clustering and histogram clustering, that group flows with similar temporal patterns prior to model training. Clustering creates more homogeneous data subsets, enabling models to capture underlying patterns more effectively and generalize better than global prediction approaches that fit a single model to the entire TM. Compared to existing TM prediction methods, our method reduces RMSE by up to 92% for Abilene and 75% for GÉANT. In routing scenarios, our clustered predictions also reduce maximum link utilization (MLU) bias by 18% and 21%, respectively, demonstrating the practical benefits of clustering when TMs are used for network optimization.
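The two evaluation quantities named in this abstract can be sketched as follows, using their standard definitions: RMSE over all entries of a TM, and MLU as the largest load-to-capacity ratio over all links under a fixed routing. The three-node line topology, the capacities, the paths, and both matrices are hypothetical toy values, and the abstract's exact definition of "MLU bias" is not given, so it is not reproduced here.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error over all entries of a traffic matrix."""
    errs = [(a - p) ** 2
            for row_a, row_p in zip(actual, predicted)
            for a, p in zip(row_a, row_p)]
    return math.sqrt(sum(errs) / len(errs))

def max_link_utilization(tm, paths, capacity):
    """MLU: the largest load/capacity ratio over all links.
    `paths` maps (src, dst) to the list of links that flow traverses."""
    load = {link: 0.0 for link in capacity}
    for (src, dst), links in paths.items():
        for link in links:
            load[link] += tm[src][dst]
    return max(load[link] / capacity[link] for link in capacity)

# Hypothetical line topology 0 - 1 - 2 with two links of capacity 10.
capacity = {(0, 1): 10.0, (1, 2): 10.0}
paths = {
    (0, 1): [(0, 1)],
    (0, 2): [(0, 1), (1, 2)],
    (1, 2): [(1, 2)],
}
true_tm = [[0, 4, 2], [0, 0, 3], [0, 0, 0]]
pred_tm = [[0, 5, 2], [0, 0, 3], [0, 0, 0]]

error = rmse(true_tm, pred_tm)
mlu_true = max_link_utilization(true_tm, paths, capacity)
mlu_pred = max_link_utilization(pred_tm, paths, capacity)
```

In this toy case the prediction overestimates a single flow, which raises the MLU computed from the predicted TM above the MLU of the true TM. That gap is the kind of routing-level effect a prediction error can have.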