MLJun 17, 2014

DFacTo: Distributed Factorization of Tensors

arXiv:1406.4519v1163 citations

Originality Incremental advance

AI Analysis

This work addresses computational bottlenecks in tensor factorization for applications like data analysis, but it is incremental as it improves existing methods rather than introducing a new paradigm.

The paper tackles the problem of speeding up Alternating Least Squares (ALS) and Gradient Descent (GD) for tensor factorization by exploiting properties of the Khatri-Rao product, resulting in DFacTo, which is 4 to 10 times faster on average than competing algorithms, e.g., taking 480 seconds on 4 machines for one ALS iteration on a large tensor.

We present a technique for significantly speeding up Alternating Least Squares (ALS) and Gradient Descent (GD), two widely used algorithms for tensor factorization. By exploiting properties of the Khatri-Rao product, we show how to efficiently address a computationally challenging sub-step of both algorithms. Our algorithm, DFacTo, only requires two sparse matrix-vector products and is easy to parallelize. DFacTo is not only scalable but also on average 4 to 10 times faster than competing algorithms on a variety of datasets. For instance, DFacTo only takes 480 seconds on 4 machines to perform one iteration of the ALS algorithm and 1,143 seconds to perform one iteration of the GD algorithm on a 6.5 million x 2.5 million x 1.5 million dimensional tensor with 1.2 billion non-zero entries.

View on arXiv PDF

Similar