DS LGNov 21, 2024

Overcomplete Tensor Decomposition via Koszul-Young Flattenings

Pravesh K. Kothari, Ankur Moitra, Alexander S. Wein

arXiv:2411.14344v21.24 citationsh-index: 6FOCS

Originality Highly original

AI Analysis

This addresses tensor decomposition for algebraic complexity and machine learning, offering incremental algorithmic improvements with theoretical bounds.

The paper tackles the problem of decomposing tensors into a minimal number of rank-1 terms with certified uniqueness, using Koszul-Young flattenings to achieve a guaranteed success for ranks up to (1-ε)(n₂ + n₃) under generic conditions, improving over prior methods by a factor of 2 in some cases.

Motivated by connections between algebraic complexity lower bounds and tensor decompositions, we investigate Koszul-Young flattenings, which are the main ingredient in recent lower bounds for matrix multiplication. Based on this tool we give a new algorithm for decomposing an $n_1 \times n_2 \times n_3$ tensor as the sum of a minimal number of rank-1 terms, and certifying uniqueness of this decomposition. For $n_1 \le n_2 \le n_3$ with $n_1 \to \infty$ and $n_3/n_2 = O(1)$, our algorithm is guaranteed to succeed when the tensor rank is bounded by $r \le (1-ε)(n_2 + n_3)$ for an arbitrary $ε> 0$, provided the tensor components are generically chosen. For any fixed $ε$, the runtime is polynomial in $n_3$. When $n_2 = n_3 = n$, our condition on the rank gives a factor-of-2 improvement over the classical simultaneous diagonalization algorithm, which requires $r \le n$, and also improves on the recent algorithm of Koiran (2024) which requires $r \le 4n/3$. It also improves on the PhD thesis of Persu (2018) which solves rank detection for $r \leq 3n/2$. We complement our upper bounds by showing limitations, in particular that no flattening of the style we consider can surpass rank $n_2 + n_3$. Furthermore, for $n \times n \times n$ tensors, we show that an even more general class of degree-$d$ polynomial flattenings cannot surpass rank $Cn$ for a constant $C = C(d)$. This suggests that for tensor decompositions, the case of generic components may be fundamentally harder than that of random components, where efficient decomposition is possible even in highly overcomplete settings.

View on arXiv PDF

Similar