Guaranteed Simultaneous Asymmetric Tensor Decomposition via Orthogonalized Alternating Least Squares
This provides a theoretical guarantee for simultaneous orthogonal asymmetric tensor decomposition, addressing a bottleneck in machine learning applications where existing methods lack such guarantees.
The paper tackles the problem of simultaneously recovering the top r components in tensor CP decomposition, proposing a Slicing Initialized Alternating Subspace Iteration (s-ASI) method that guarantees ε-close recovery almost surely under noiseless conditions and with high probability under bounded noise, using O(log(log(1/ε))) iteration steps.
Tensor CANDECOMP/PARAFAC (CP) decomposition is an important tool that solves a wide class of machine learning problems. Existing popular approaches recover components one by one, not necessarily in the order of larger components first. Recently developed simultaneous power method obtains only a high probability recovery of top $r$ components even when the observed tensor is noiseless. We propose a Slicing Initialized Alternating Subspace Iteration (s-ASI) method that is guaranteed to recover top $r$ components ($ε$-close) simultaneously for (a)symmetric tensors almost surely under the noiseless case (with high probability for a bounded noise) using $O(\log(\log \frac{1}ε))$ steps of tensor subspace iterations. Our s-ASI method introduces a Slice-Based Initialization that runs $O(1/\log(\frac{λ_r}{λ_{r+1}}))$ steps of matrix subspace iterations, where $λ_r$ denotes the r-th top singular value of the tensor. We are the first to provide a theoretical guarantee on simultaneous orthogonal asymmetric tensor decomposition. Under the noiseless case, we are the first to provide an \emph{almost sure} theoretical guarantee on simultaneous orthogonal tensor decomposition. When tensor is noisy, our algorithm for asymmetric tensor is robust to noise smaller than $\min\{O(\frac{(λ_r - λ_{r+1})ε}{\sqrt{r}}), O(δ_0\frac{λ_r -λ_{r+1}}{\sqrt{d}})\}$, where $δ_0$ is a small constant proportional to the probability of bad initializations in the noisy setting.