Toeplitz Least Squares Problems, Fast Algorithms and Big Data
This work addresses computational efficiency for statisticians and data scientists dealing with large-scale time series analysis, though it is incremental as it compares existing algorithms rather than introducing new ones.
The paper tackled the computational bottleneck of solving Toeplitz least squares problems for autoregressive model fitting in big time-series data by comparing two randomized algorithms (LSAR and Repeated Halving). It found that both algorithms performed comparably on synthetic datasets, but LSAR was more robust on real-world data, demonstrating the effectiveness of randomized numerical linear algebra in this context.
In time series analysis, when fitting an autoregressive model, one must solve a Toeplitz ordinary least squares problem numerous times to find an appropriate model, which can severely affect computational times with large data sets. Two recent algorithms (LSAR and Repeated Halving) have applied randomized numerical linear algebra (RandNLA) techniques to fitting an autoregressive model to big time-series data. We investigate and compare the quality of these two approximation algorithms on large-scale synthetic and real-world data. While both algorithms display comparable results for synthetic datasets, the LSAR algorithm appears to be more robust when applied to real-world time series data. We conclude that RandNLA is effective in the context of big-data time series.