LGJan 25, 2023
A Provable Splitting Approach for Symmetric Nonnegative Matrix FactorizationXiao Li, Zhihui Zhu, Qiuwei Li et al.
The symmetric Nonnegative Matrix Factorization (NMF), a special but important class of the general NMF, has found numerous applications in data analysis such as various clustering tasks. Unfortunately, designing fast algorithms for the symmetric NMF is not as easy as for its nonsymmetric counterpart, since the latter admits the splitting property that allows state-of-the-art alternating-type algorithms. To overcome this issue, we first split the decision variable and transform the symmetric NMF to a penalized nonsymmetric one, paving the way for designing efficient alternating-type algorithms. We then show that solving the penalized nonsymmetric reformulation returns a solution to the original symmetric NMF. Moreover, we design a family of alternating-type algorithms and show that they all admit strong convergence guarantee: the generated sequence of iterates is convergent and converges at least sublinearly to a critical point of the original symmetric NMF. Finally, we conduct experiments on both synthetic data and real image clustering to support our theoretical results and demonstrate the performance of the alternating-type algorithms.
28.7ITApr 8
On A Class of Greedy Sparse Recovery AlgorithmsGang Li, Qiuwei Li, Shuang Li et al.
Sparse signal recovery deals with finding the sparsest solution of an under-determined linear system $\vx = \mQ\vs$. In this paper, we propose a novel greedy approach to addressing the challenges from such a problem. Such an approach is based on a characterization of solutions to the system, which allows us to work on the sparse recovery in the $\vs$-space directly with a given measure. With $l_2$-based measure, an orthogonal matching pursuit (OMP)-type algorithm is proposed, which significantly outperforms the classical OMP algorithm in terms of recovery accuracy while maintaining comparable computational complexity. An $l_1$-based algorithm, denoted as $\text{Alg}_{GL1}$, is derived. Such an algorithm significantly outperforms the classical basis pursuit (BP) algorithm. Combining with the CoSaMP-strategy for selecting atoms, a class of high performance greedy algorithms is also derived. Extensive numerical simulations on both synthetic and image data are carried out, with which the superior performance of our proposed algorithms is demonstrated in terms of sparse recovery accuracy and robustness against numerical instability of the system matrix $\mQ$ and disturbance in the measurement $\vx$.
LGJun 2, 2021
Operator Splitting for Learning to Predict Equilibria in Convex GamesDaniel McKenzie, Howard Heaton, Qiuwei Li et al.
Systems of competing agents can often be modeled as games. Assuming rationality, the most likely outcomes are given by an equilibrium (e.g. a Nash equilibrium). In many practical settings, games are influenced by context, i.e. additional data beyond the control of any agent (e.g. weather for traffic and fiscal policy for market economies). Often the exact game mechanics are unknown, yet vast amounts of historical data consisting of (context, equilibrium) pairs are available, raising the possibility of learning a solver which predicts the equilibria given only the context. We introduce Nash Fixed Point Networks (N-FPNs), a class of neural networks that naturally output equilibria. Crucially, N- FPNs employ a constraint decoupling scheme to handle complicated agent action sets while avoiding expensive projections. Empirically, we find N-FPNs are compatible with the recently developed Jacobian-Free Backpropagation technique for training implicit networks, making them significantly faster and easier to train than prior models. Our experiments show N-FPNs are capable of scaling to problems orders of magnitude larger than existing learned game solvers.
LGMar 23, 2021
JFB: Jacobian-Free Backpropagation for Implicit NetworksSamy Wu Fung, Howard Heaton, Qiuwei Li et al.
A promising trend in deep learning replaces traditional feedforward networks with implicit networks. Unlike traditional networks, implicit networks solve a fixed point equation to compute inferences. Solving for the fixed point varies in complexity, depending on provided data and an error tolerance. Importantly, implicit networks may be trained with fixed memory costs in stark contrast to feedforward networks, whose memory requirements scale linearly with depth. However, there is no free lunch -- backpropagation through implicit networks often requires solving a costly Jacobian-based equation arising from the implicit function theorem. We propose Jacobian-Free Backpropagation (JFB), a fixed-memory approach that circumvents the need to solve Jacobian-based equations. JFB makes implicit networks faster to train and significantly easier to implement, without sacrificing test accuracy. Our experiments show implicit networks trained with JFB are competitive with feedforward networks and prior implicit networks given the same number of parameters.
OCApr 22, 2019
Provable Bregman-divergence based Methods for Nonconvex and Non-Lipschitz ProblemsQiuwei Li, Zhihui Zhu, Gongguo Tang et al.
The (global) Lipschitz smoothness condition is crucial in establishing the convergence theory for most optimization methods. Unfortunately, most machine learning and signal processing problems are not Lipschitz smooth. This motivates us to generalize the concept of Lipschitz smoothness condition to the relative smoothness condition, which is satisfied by any finite-order polynomial objective function. Further, this work develops new Bregman-divergence based algorithms that are guaranteed to converge to a second-order stationary point for any relatively smooth problem. In addition, the proposed optimization methods cover both the proximal alternating minimization and the proximal alternating linearized minimization when we specialize the Bregman divergence to the Euclidian distance. Therefore, this work not only develops guaranteed optimization methods for non-Lipschitz smooth problems but also solves an open problem of showing the second-order convergence guarantees for these alternating minimization methods.
LGMar 16, 2019
Spherical Principal Component AnalysisKai Liu, Qiuwei Li, Hua Wang et al.
Principal Component Analysis (PCA) is one of the most important methods to handle high dimensional data. However, most of the studies on PCA aim to minimize the loss after projection, which usually measures the Euclidean distance, though in some fields, angle distance is known to be more important and critical for analysis. In this paper, we propose a method by adding constraints on factors to unify the Euclidean distance and angle distance. However, due to the nonconvexity of the objective and constraints, the optimized solution is not easy to obtain. We propose an alternating linearized minimization method to solve it with provable convergence rate and guarantee. Experiments on synthetic data and real-world datasets have validated the effectiveness of our method and demonstrated its advantages over state-of-art clustering methods.
LGNov 14, 2018
Dropping Symmetry for Fast Symmetric Nonnegative Matrix FactorizationZhihui Zhu, Xiao Li, Kai Liu et al.
Symmetric nonnegative matrix factorization (NMF), a special but important class of the general NMF, is demonstrated to be useful for data analysis and in particular for various clustering tasks. Unfortunately, designing fast algorithms for Symmetric NMF is not as easy as for the nonsymmetric counterpart, the latter admitting the splitting property that allows efficient alternating-type algorithms. To overcome this issue, we transfer the symmetric NMF to a nonsymmetric one, then we can adopt the idea from the state-of-the-art algorithms for nonsymmetric NMF to design fast algorithms solving symmetric NMF. We rigorously establish that solving nonsymmetric reformulation returns a solution for symmetric NMF and then apply fast alternating based algorithms for the corresponding reformulated problem. Furthermore, we show these fast algorithms admit strong convergence guarantee in the sense that the generated sequence is convergent at least at a sublinear rate and it converges globally to a critical point of the symmetric NMF. We conduct experiments on both synthetic data and image clustering to support our result.
OCNov 7, 2018
Global Optimality in Distributed Low-rank Matrix FactorizationZhihui Zhu, Qiuwei Li, Xinshuo Yang et al.
We study the convergence of a variant of distributed gradient descent (DGD) on a distributed low-rank matrix approximation problem wherein some optimization variables are used for consensus (as in classical DGD) and some optimization variables appear only locally at a single node in the network. We term the resulting algorithm DGD+LOCAL. Using algorithmic connections to gradient descent and geometric connections to the well-behaved landscape of the centralized low-rank matrix approximation problem, we identify sufficient conditions where DGD+LOCAL is guaranteed to converge with exact consensus to a global minimizer of the original centralized problem. For the distributed low-rank matrix approximation problem, these guarantees are stronger---in terms of consensus and optimality---than what appear in the literature for classical DGD and more general problems.
SPSep 19, 2017
Optimized Structured Sparse Sensing Matrices for Compressive SensingTao Hong, Xiao Li, Zhihui Zhu et al.
We consider designing a robust structured sparse sensing matrix consisting of a sparse matrix with a few non-zero entries per row and a dense base matrix for capturing signals efficiently We design the robust structured sparse sensing matrix through minimizing the distance between the Gram matrix of the equivalent dictionary and the target Gram of matrix holding small mutual coherence. Moreover, a regularization is added to enforce the robustness of the optimized structured sparse sensing matrix to the sparse representation error (SRE) of signals of interests. An alternating minimization algorithm with global sequence convergence is proposed for solving the corresponding optimization problem. Numerical experiments on synthetic data and natural images show that the obtained structured sensing matrix results in a higher signal reconstruction than a random dense sensing matrix.
NAApr 5, 2017
Geometry of Factored Nuclear Norm RegularizationQiuwei Li, Zhihui Zhu, Gongguo Tang
This work investigates the geometry of a nonconvex reformulation of minimizing a general convex loss function $f(X)$ regularized by the matrix nuclear norm $\|X\|_*$. Nuclear-norm regularized matrix inverse problems are at the heart of many applications in machine learning, signal processing, and control. The statistical performance of nuclear norm regularization has been studied extensively in literature using convex analysis techniques. Despite its optimal performance, the resulting optimization has high computational complexity when solved using standard or even tailored fast convex solvers. To develop faster and more scalable algorithms, we follow the proposal of Burer-Monteiro to factor the matrix variable $X$ into the product of two smaller rectangular matrices $X=UV^T$ and also replace the nuclear norm $\|X\|_*$ with $(\|U\|_F^2+\|V\|_F^2)/2$. In spite of the nonconvexity of the factored formulation, we prove that when the convex loss function $f(X)$ is $(2r,4r)$-restricted well-conditioned, each critical point of the factored problem either corresponds to the optimal solution $X^\star$ of the original convex optimization or is a strict saddle point where the Hessian matrix has a strictly negative eigenvalue. Such a geometric structure of the factored formulation allows many local search algorithms to converge to the global optimum with random initializations.