OCTen: Online Compression-based Tensor Decomposition
This work addresses the need for scalable tensor decomposition in data mining and machine learning on large, time-varying datasets. It is an incremental improvement built on novel compression techniques.
The paper tackles the problem of decomposing dynamically growing tensors by introducing OCTen, the first compression-based online parallel implementation of the CP decomposition, which achieves comparable or better accuracy and efficiency than state-of-the-art methods while saving up to 40-200% memory space.
Tensor decompositions are powerful tools for large data analytics, as they jointly model multiple aspects of data in one framework and enable the discovery of latent structures and higher-order correlations within the data. One of the most widely studied and used decompositions, especially in data mining and machine learning, is the Canonical Polyadic (CP) decomposition. However, today's datasets are not static; they are often dynamically growing and changing with time. To operate on such large data, we present OCTen, the first-ever compression-based online parallel implementation of the CP decomposition. We conduct an extensive empirical analysis of the algorithms in terms of fitness, memory used, and CPU time, and, to demonstrate the compression and scalability of the method, we apply OCTen to big tensor data. Indicatively, OCTen performs on par with or better than state-of-the-art online and offline methods in terms of decomposition accuracy and efficiency, while saving up to 40-200% memory space.
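For readers unfamiliar with the CP decomposition, the following minimal sketch may help. It is not OCTen: it is a plain batch alternating least squares (ALS) fit of a three-way tensor, with no compression, no online updates, and no parallelism. The function names (khatri_rao, unfold, cp_als, reconstruct, fitness) are illustrative and do not come from the paper, and the fitness formula, 1 - ||X - X_hat|| / ||X||, is assumed here as a common definition rather than taken from the source.

```python
import numpy as np

def khatri_rao(U, V):
    """Column-wise Kronecker product of U (I x R) and V (J x R), giving (I*J x R)."""
    R = U.shape[1]
    return np.einsum("ir,jr->ijr", U, V).reshape(-1, R)

def unfold(X, mode):
    """Matricize X: the chosen mode indexes rows, the remaining modes are flattened."""
    return np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)

def cp_als(X, rank, n_iter=100, seed=0):
    """Batch CP-ALS for a 3-way tensor (illustrative baseline, not OCTen)."""
    assert X.ndim == 3, "this sketch handles three-way tensors only"
    rng = np.random.default_rng(seed)
    factors = [rng.standard_normal((dim, rank)) for dim in X.shape]
    for _ in range(n_iter):
        for mode in range(3):
            # Fix the other two factor matrices and solve a linear least-squares
            # problem for the factor of the current mode.
            first, second = [factors[m] for m in range(3) if m != mode]
            kr = khatri_rao(first, second)
            gram = (first.T @ first) * (second.T @ second)
            factors[mode] = unfold(X, mode) @ kr @ np.linalg.pinv(gram)
    return factors

def reconstruct(factors):
    """Sum of rank-one terms: X_hat[i,j,k] = sum_r A[i,r] * B[j,r] * C[k,r]."""
    A, B, C = factors
    return np.einsum("ir,jr,kr->ijk", A, B, C)

def fitness(X, factors):
    """Assumed fitness measure: 1 - relative Frobenius reconstruction error."""
    return 1.0 - np.linalg.norm(X - reconstruct(factors)) / np.linalg.norm(X)

# Small demonstration on a synthetic tensor built from known rank-2 factors.
rng = np.random.default_rng(42)
true_factors = [rng.standard_normal((dim, 2)) for dim in (6, 5, 4)]
X = reconstruct(true_factors)
estimated = cp_als(X, rank=2)
print("fitness:", fitness(X, estimated))
```

A batch method like this must hold the full tensor in memory and refit from scratch whenever new data arrive, which is the cost that online, compression-based approaches such as the one described above aim to avoid.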