ML · LG · SP · Apr 6

A Muon-Accelerated Algorithm for Low Separation Rank Tensor Generalized Linear Models

arXiv:2604.04726 · 7.3
Predicted impact: top 52% in ML · last 90 days
Originality: Incremental advance
AI Analysis

This work addresses a computational bottleneck for researchers and practitioners who work with tensor-valued data in fields such as biomedical imaging. The advance is incremental, since it builds on the existing LSRTR algorithm.

The paper tackles the computational inefficiency of estimating low separation rank tensor generalized linear models. It proposes LSRTR-M, which replaces the QR-based projection steps of LSRTR with Muon updates. The result is faster convergence and lower errors on synthetic tasks, and improved efficiency on a 3D classification task.

Tensor-valued data arise naturally in multidimensional signal and imaging problems, such as biomedical imaging. When incorporated into generalized linear models (GLMs), naive vectorization can destroy their multi-way structure and lead to high-dimensional, ill-posed estimation. To address this challenge, Low Separation Rank (LSR) decompositions reduce model complexity by imposing low-rank multilinear structure on the coefficient tensor. A representative approach for estimating LSR-based tensor GLMs (LSR-TGLMs) is the Low Separation Rank Tensor Regression (LSRTR) algorithm, which adopts block coordinate descent and enforces orthogonality of the factor matrices through repeated QR-based projections. However, the repeated projection steps can be computationally demanding and can slow convergence. Motivated by the need for scalable estimation and classification from such data, we propose LSRTR-M, which incorporates Muon (MomentUm Orthogonalized by Newton-Schulz) updates into the LSRTR framework. Specifically, LSRTR-M preserves the original block coordinate scheme while replacing the projection-based factor updates with Muon steps. Across synthetic linear, logistic, and Poisson LSR-TGLMs, LSRTR-M converges faster in both iteration count and wall-clock time, while achieving lower normalized estimation and prediction errors. On the Vessel MNIST 3D task, it further improves computational efficiency while maintaining competitive classification performance.
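
To make the two ingredients named in the abstract concrete, here is a minimal NumPy sketch of a QR-based projection next to a Muon-style step (momentum followed by Newton-Schulz orthogonalization). It is an illustration under stated assumptions, not the paper's LSRTR-M implementation: the function names, the step size `lr`, the momentum weight `beta`, and the use of the classic cubic Newton-Schulz iteration (rather than any tuned polynomial) are all choices made here for clarity. How LSRTR-M wires such a step into each block update is not specified in the abstract.

```python
import numpy as np


def qr_projection(w):
    """Map a factor matrix to one with orthonormal columns via thin QR."""
    q, _ = np.linalg.qr(w)
    return q


def newton_schulz_orthogonalize(m, steps=50):
    """Approximate the orthogonal polar factor of a tall, full-column-rank matrix.

    Uses the classic cubic iteration X <- 1.5 X - 0.5 X X^T X. Practical Muon
    implementations typically run only a few steps of a tuned polynomial; the
    larger step count here is an assumption chosen so the result is accurate
    enough to check numerically.
    """
    # Frobenius norm bounds the spectral norm, so all singular values are <= 1,
    # which keeps the cubic iteration inside its convergence region.
    x = m / (np.linalg.norm(m) + 1e-12)
    for _ in range(steps):
        x = 1.5 * x - 0.5 * x @ (x.T @ x)
    return x


def muon_style_step(w, grad, momentum, lr=0.1, beta=0.9):
    """One hypothetical Muon-style update for a single factor matrix.

    Accumulates momentum, orthogonalizes it with Newton-Schulz, and moves the
    factor along that orthogonalized direction. Returns the new factor and the
    updated momentum buffer.
    """
    momentum = beta * momentum + grad
    direction = newton_schulz_orthogonalize(momentum)
    return w - lr * direction, momentum


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = qr_projection(rng.standard_normal((6, 3)))   # orthonormal starting factor
    grad = rng.standard_normal((6, 3))               # stand-in for a block gradient
    w_new, buf = muon_style_step(w, grad, np.zeros_like(w))
    d = newton_schulz_orthogonalize(grad)
    # Deviation of the orthogonalized direction from exact orthonormality.
    print(np.linalg.norm(d.T @ d - np.eye(3)))
```

The design point the sketch highlights is that Newton-Schulz needs only matrix multiplications, whereas the projection route calls a QR factorization each time it is applied.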

