OC Dec 10, 2025
The Ky Fan Norms and Beyond: Dual Norms and Combinations for Matrix Optimization
Alexey Kravatskiy, Ivan Kozyrev, Nikolai Kozlov et al.
In this article, we explore the use of various matrix norms for optimizing functions of weight matrices, a crucial problem in training large language models. Moving beyond the spectral norm underlying the Muon update, we leverage duals of the Ky Fan $k$-norms to introduce a family of Muon-like algorithms we name Fanions, which are closely related to Dion. By working with duals of convex combinations of the Ky Fan $k$-norms with either the Frobenius norm or the $l_\infty$ norm, we construct the families of F-Fanions and S-Fanions, respectively. Their most prominent members are F-Muon and S-Muon. We complement our theoretical analysis with an extensive empirical study of these algorithms across a wide range of tasks and settings, demonstrating that F-Muon and S-Muon consistently match Muon's performance, while outperforming vanilla Muon on a synthetic linear least squares problem.
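For orientation, the norms named in the abstract can be computed directly from singular values. The sketch below is not the authors' code. It uses only the textbook definitions: the Ky Fan $k$-norm is the sum of the $k$ largest singular values, and its dual norm is the maximum of the spectral norm and the nuclear norm divided by $k$. The function names `ky_fan_norm` and `ky_fan_dual_norm` are hypothetical, chosen for illustration.

```python
import numpy as np


def ky_fan_norm(A, k):
    """Ky Fan k-norm: sum of the k largest singular values of A."""
    # np.linalg.svd returns singular values in descending order.
    s = np.linalg.svd(A, compute_uv=False)
    return float(np.sum(s[:k]))


def ky_fan_dual_norm(A, k):
    """Dual of the Ky Fan k-norm: max(spectral norm, nuclear norm / k)."""
    s = np.linalg.svd(A, compute_uv=False)
    return float(max(s[0], np.sum(s) / k))


# Small diagonal example with singular values 3, 2, 1.
A = np.diag([3.0, 2.0, 1.0])
spectral = ky_fan_norm(A, 1)        # k = 1 gives the spectral norm
nuclear = ky_fan_norm(A, 3)         # k = min(m, n) gives the nuclear norm
dual_of_spectral = ky_fan_dual_norm(A, 1)  # equals the nuclear norm
dual_of_nuclear = ky_fan_dual_norm(A, 3)   # equals the spectral norm
```

The two endpoints recover the familiar duality between the spectral and nuclear norms, and intermediate values of $k$ interpolate between them.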
8.6 NA Apr 15
Subset selection for matrices by column exchange
Alexander Osinsky, Ivan Kozyrev
The paper considers the problem of finding a submatrix $X_{\mathcal{S}} \in \mathbb{R}^{m \times k}$ of a matrix $X \in \mathbb{R}^{m \times n}$ such that the spectral or Frobenius norm of $X_{\mathcal{S}}^{\dagger} X$ is bounded, which guarantees that the submatrix represents the whole matrix well. Such bounds can be reached by greedy algorithms that maximize the submatrix volume. We propose a modification of greedy volume maximization that performs column exchanges asymptotically faster for $n \gg m$ than the known alternatives while guaranteeing the same bounds on $X_{\mathcal{S}}^{\dagger} X$. In addition, we prove a new upper bound on the number of required exchanges, which applies to the new algorithm as well as to other greedy volume maximization algorithms.
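The quantity being bounded can be evaluated for any candidate column subset. The sketch below is an illustration only and does not implement the paper's exchange algorithm: it forms the pseudoinverse of the selected columns, multiplies by the full matrix, and reports the spectral and Frobenius norms of the result. The helper name `representation_norms` and the example matrix are assumptions made for this example.

```python
import numpy as np


def representation_norms(X, S):
    """Return (spectral, Frobenius) norms of pinv(X[:, S]) @ X."""
    X_S = X[:, S]                    # selected columns, shape (m, k)
    C = np.linalg.pinv(X_S) @ X      # coefficients expressing X via X_S
    return float(np.linalg.norm(C, 2)), float(np.linalg.norm(C, "fro"))


# Toy 2 x 3 matrix; the first two columns form the identity,
# so pinv(X_S) @ X is X itself.
X = np.array([[1.0, 0.0, 2.0],
              [0.0, 1.0, 3.0]])
spec, fro = representation_norms(X, [0, 1])
```

Smaller values of either norm mean the chosen columns reconstruct the remaining ones with smaller coefficients, which is the sense in which the subset represents the whole matrix.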