Guaranteed Nonconvex Low-Rank Tensor Estimation via Scaled Gradient Descent
This paper addresses a fundamental challenge in signal processing and machine learning: handling multi-dimensional data with corruptions. The contribution may appear incremental, since it builds on existing t-SVD frameworks.
The paper tackles the problem of reliably extracting meaningful information from tensor data corrupted by missing entries and sparse noise. It develops a scaled gradient descent (ScaledGD) algorithm that, for tensor robust PCA and tensor completion, converges linearly at a rate independent of the condition number. Numerical examples demonstrate the accelerated convergence.
Tensors, which provide a faithful and effective representation of the intrinsic structure of multi-dimensional data, play a crucial role in an increasing number of signal processing and machine learning problems. However, tensor data are often subject to arbitrary corruptions, including missing entries and sparse noise. A fundamental challenge is to reliably extract the meaningful information from corrupted tensor data in a statistically and computationally efficient manner. This paper develops a scaled gradient descent (ScaledGD) algorithm that directly estimates the tensor factors, with tailored spectral initializations, under the tensor-tensor product (t-product) and tensor singular value decomposition (t-SVD) framework. In theory, ScaledGD achieves linear convergence at a constant rate that is independent of the condition number of the ground truth low-rank tensor for two canonical problems -- tensor robust principal component analysis and tensor completion -- as long as the corruption level is not too large and the sample size is sufficiently large, while maintaining the low per-iteration cost of gradient descent. To the best of our knowledge, ScaledGD is the first algorithm that provably has such properties for low-rank tensor estimation under the t-SVD. Finally, numerical examples are provided to demonstrate the efficacy of ScaledGD in accelerating the convergence of ill-conditioned low-rank tensor estimation in these two applications.
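To make the t-product and the scaled update concrete, the following is a minimal NumPy sketch under simplifying assumptions, not the algorithm as specified in the paper. It assumes a fully observed, uncorrupted toy loss f(L, R) = 0.5 * ||L * R^H - Y||_F^2 (so no sparse noise, no missing entries), a small perturbation of the true factors in place of the spectral initialization, and the standard fact that the t-product reduces to slice-wise matrix products after an FFT along the third mode. The names `t_product`, `t_transpose`, `scaled_gd_step`, and `demo`, the step size, and the problem sizes are all hypothetical choices made for illustration.

```python
import numpy as np


def t_product(A, B):
    """t-product of A (n1 x r x n3) and B (r x n2 x n3), computed slice-wise in the Fourier domain."""
    Af = np.fft.fft(A, axis=2)
    Bf = np.fft.fft(B, axis=2)
    Cf = np.einsum("irk,rjk->ijk", Af, Bf)  # one matrix product per frequency slice
    return np.real(np.fft.ifft(Cf, axis=2))


def t_transpose(A):
    """Tensor transpose of a real tensor: transpose each frontal slice, reverse slices 2..n3."""
    At = np.transpose(A, (1, 0, 2))
    return np.concatenate([At[:, :, :1], At[:, :, :0:-1]], axis=2)


def scaled_gd_step(L, R, Y, eta=0.5):
    """One scaled gradient step on the toy loss 0.5 * ||L * R^H - Y||_F^2 (assumed loss)."""
    Lf, Rf, Yf = (np.fft.fft(T, axis=2) for T in (L, R, Y))
    Lf_new = np.empty_like(Lf)
    Rf_new = np.empty_like(Rf)
    for k in range(Y.shape[2]):
        Lk, Rk = Lf[:, :, k], Rf[:, :, k]
        Ek = Lk @ Rk.conj().T - Yf[:, :, k]  # residual in frequency slice k
        # Plain gradients are Ek @ Rk and Ek^H @ Lk; the scaling right-multiplies
        # them by the inverse r x r Gram matrix of the other factor.
        Lf_new[:, :, k] = Lk - eta * (Ek @ Rk) @ np.linalg.inv(Rk.conj().T @ Rk)
        Rf_new[:, :, k] = Rk - eta * (Ek.conj().T @ Lk) @ np.linalg.inv(Lk.conj().T @ Lk)
    return np.real(np.fft.ifft(Lf_new, axis=2)), np.real(np.fft.ifft(Rf_new, axis=2))


def demo(n1=20, n2=20, n3=5, r=2, iters=30, seed=0):
    """Run the toy iteration from a perturbed start and return the relative errors per iteration."""
    rng = np.random.default_rng(seed)
    scales = np.logspace(0, -1, r)  # unequal column scales make the target ill-conditioned
    L_star = rng.standard_normal((n1, r, n3)) * scales[None, :, None]
    R_star = rng.standard_normal((n2, r, n3)) * scales[None, :, None]
    Y = t_product(L_star, t_transpose(R_star))
    # Assumption: start near the truth instead of using a spectral initialization.
    L = L_star + 1e-3 * rng.standard_normal(L_star.shape)
    R = R_star + 1e-3 * rng.standard_normal(R_star.shape)
    errs = []
    for _ in range(iters + 1):
        X = t_product(L, t_transpose(R))
        errs.append(np.linalg.norm(X - Y) / np.linalg.norm(Y))
        L, R = scaled_gd_step(L, R, Y)
    return errs
```

Calling `demo()` returns the sequence of relative reconstruction errors, which can be plotted on a log scale to inspect the convergence behavior. The only extra cost over a plain gradient step in this sketch is inverting one r x r matrix per factor and frequency slice, which is small when the rank r is small.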