LGOct 28, 2022
Scalable Spectral Clustering with Group Fairness ConstraintsJi Wang, Ding Lu, Ian Davidson et al.
There are synergies of research interests and industrial efforts in modeling fairness and correcting algorithmic bias in machine learning. In this paper, we present a scalable algorithm for spectral clustering (SC) with group fairness constraints. Group fairness is also known as statistical parity where in each cluster, each protected group is represented with the same proportion as in the entirety. While FairSC algorithm (Kleindessner et al., 2019) is able to find the fairer clustering, it is compromised by high costs due to the kernels of computing nullspaces and the square roots of dense matrices explicitly. We present a new formulation of underlying spectral computation by incorporating nullspace projection and Hotelling's deflation such that the resulting algorithm, called s-FairSC, only involves the sparse matrix-vector products and is able to fully exploit the sparsity of the fair SC model. The experimental results on the modified stochastic block model demonstrate that s-FairSC is comparable with FairSC in recovering fair clustering. Meanwhile, it is sped up by a factor of 12 for moderate model sizes. s-FairSC is further demonstrated to be scalable in the sense that the computational costs of s-FairSC only increase marginally compared to the SC without fairness constraints.
NAMay 30, 2016
Convergence analysis of a locally accelerated preconditioned steepest descent method for Hermitian-definite generalized eigenvalue problemsYunfeng Cai, Zhaojun Bai, John E. Pask et al.
By extending the classical analysis techniques due to Samokish, Faddeev and Faddeeva, and Longsine and McCormick among others, we prove the convergence of preconditioned steepest descent with implicit deflation (PSD-id) method for solving Hermitian-definite generalized eigenvalue problems. Furthermore, we derive a nonasymptotic estimate of the rate of convergence of the \psdid method. We show that with the proper choice of the shift, the indefinite shift-and-invert preconditioner is a locally accelerated preconditioner, and is asymptotically optimal that leads to superlinear convergence. Numerical examples are presented to verify the theoretical results on the convergence behavior of the \psdid method for solving ill-conditioned Hermitian-definite generalized eigenvalue problems arising from electronic structure calculations. While rigorous and full-scale convergence proofs of preconditioned block steepest descent methods in practical use still largely eludes us, we believe the theoretical results presented in this paper sheds light on an improved understanding of the convergence behavior of these block methods.
MLNov 21, 2022
A Bi-level Nonlinear Eigenvector Algorithm for Wasserstein Discriminant AnalysisDong Min Roh, Zhaojun Bai, Ren-Cang Li
Much like the classical Fisher linear discriminant analysis (LDA), the recently proposed Wasserstein discriminant analysis (WDA) is a linear dimensionality reduction method that seeks a projection matrix to maximize the dispersion of different data classes and minimize the dispersion of same data classes via a bi-level optimization. In contrast to LDA, WDA can account for both global and local interconnections between data classes by using the underlying principles of optimal transport. In this paper, a bi-level nonlinear eigenvector algorithm (WDA-nepv) is presented to fully exploit the structures of the bi-level optimization of WDA. The inner level of WDA-nepv for computing the optimal transport matrices is formulated as an eigenvector-dependent nonlinear eigenvalue problem (NEPv), and meanwhile, the outer level for trace ratio optimizations is formulated as another NEPv. Both NEPvs can be computed efficiently under the self-consistent field (SCF) framework. WDA-nepv is derivative-free and surrogate-model-free when compared with existing algorithms. Convergence analysis of the proposed WDA-nepv justifies the utilization of the SCF for solving the bi-level optimization of WDA. Numerical experiments with synthetic and real-life datasets demonstrate the classification accuracy and scalability of WDA-nepv.
NAApr 25, 2010
Error Control of Iterative Linear Solvers for Integrated Groundwater ModelsMatthew Dixon, Zhaojun Bai, Charles Brush et al.
An open problem that arises when using modern iterative linear solvers, such as the preconditioned conjugate gradient (PCG) method or Generalized Minimum RESidual method (GMRES) is how to choose the residual tolerance in the linear solver to be consistent with the tolerance on the solution error. This problem is especially acute for integrated groundwater models which are implicitly coupled to another model, such as surface water models, and resolve both multiple scales of flow and temporal interaction terms, giving rise to linear systems with variable scaling. This article uses the theory of 'forward error bound estimation' to show how rescaling the linear system affects the correspondence between the residual error in the preconditioned linear system and the solution error. Using examples of linear systems from models developed using the USGS GSFLOW package and the California State Department of Water Resources' Integrated Water Flow Model (IWFM), we observe that this error bound guides the choice of a practical measure for controlling the error in rescaled linear systems. It is found that forward error can be controlled in preconditioned GMRES by rescaling the linear system and normalizing the stopping tolerance. We implemented a preconditioned GMRES algorithm and benchmarked it against the Successive-Over-Relaxation (SOR) method. Improved error control reduces redundant iterations in the GMRES algorithm and results in overall simulation speedups as large as 7.7x. This research is expected to broadly impact groundwater modelers through the demonstration of a practical approach for setting the residual tolerance in line with the solution error tolerance.
LGMar 1, 2025
Hidden Convexity of Fair PCA and Fast Solver via Eigenvalue OptimizationJunhui Shen, Aaron J. Davis, Ding Lu et al.
Principal Component Analysis (PCA) is a foundational technique in machine learning for dimensionality reduction of high-dimensional datasets. However, PCA could lead to biased outcomes that disadvantage certain subgroups of the underlying datasets. To address the bias issue, a Fair PCA (FPCA) model was introduced by Samadi et al. (2018) for equalizing the reconstruction loss between subgroups. The semidefinite relaxation (SDR) based approach proposed by Samadi et al. (2018) is computationally expensive even for suboptimal solutions. To improve efficiency, several alternative variants of the FPCA model have been developed. These variants often shift the focus away from equalizing the reconstruction loss. In this paper, we identify a hidden convexity in the FPCA model and introduce an algorithm for convex optimization via eigenvalue optimization. Our approach achieves the desired fairness in reconstruction loss without sacrificing performance. As demonstrated in real-world datasets, the proposed FPCA algorithm runs $8\times$ faster than the SDR-based algorithm, and only at most 85% slower than the standard PCA.
NANov 7, 2019
Linear Constrained Rayleigh Quotient Optimization: Theory and AlgorithmsYunshen Zhou, Zhaojun Bai, Ren-Cang Li
We consider the following constrained Rayleigh quotient optimization problem (CRQopt) $$ \min_{x\in \mathbb{R}^n} x^{T}Ax\,\,\mbox{subject to}\,\, x^{T}x=1\,\mbox{and}\,C^{T}x=b, $$ where $A$ is an $n\times n$ real symmetric matrix and $C$ is an $n\times m$ real matrix. Usually, $m\ll n$. The problem is also known as the constrained eigenvalue problem in the literature because it becomes an eigenvalue problem if the linear constraint $C^{T}x=b$ is removed. We start by equivalently transforming CRQopt into an optimization problem, called LGopt, of minimizing the Lagrangian multiplier of CRQopt, and then an problem, called QEPmin, of finding the smallest eigenvalue of a quadratic eigenvalue problem. Although such equivalences has been discussed in the literature, it appears to be the first time that these equivalences are rigorously justified. Then we propose to numerically solve LGopt and QEPmin by the Krylov subspace projection method via the Lanczos process. The basic idea, as the Lanczos method for the symmetric eigenvalue problem, is to first reduce LGopt and QEPmin by projecting them onto Krylov subspaces to yield problems of the same types but of much smaller sizes, and then solve the reduced problems by some direct methods, which is either a secular equation solver (in the case of LGopt) or an eigensolver (in the case of QEPmin). The resulting algorithm is called the Lanczos algorithm. We perform convergence analysis for the proposed method and obtain error bounds. The sharpness of the error bound is demonstrated by artificial examples, although in applications the method often converges much faster than the bounds suggest. Finally, we apply the Lanczos algorithm to semi-supervised learning in the context of constrained clustering.
LGSep 25, 2019
A Self-consistent-field Iteration for Orthogonal Canonical Correlation AnalysisLeihong Zhang, Li Wang, Zhaojun Bai et al.
We propose an efficient algorithm for solving orthogonal canonical correlation analysis (OCCA) in the form of trace-fractional structure and orthogonal linear projections. Even though orthogonality has been widely used and proved to be a useful criterion for pattern recognition and feature extraction, existing methods for solving OCCA problem are either numerical unstable by relying on a deflation scheme, or less efficient by directly using generic optimization methods. In this paper, we propose an alternating numerical scheme whose core is the sub-maximization problem in the trace-fractional form with an orthogonal constraint. A customized self-consistent-field (SCF) iteration for this sub-maximization problem is devised. It is proved that the SCF iteration is globally convergent to a KKT point and that the alternating numerical scheme always converges. We further formulate a new trace-fractional maximization problem for orthogonal multiset CCA (OMCCA) and then propose an efficient algorithm with an either Jacobi-style or Gauss-Seidel-style updating scheme based on the same SCF iteration. Extensive experiments are conducted to evaluate the proposed algorithms against existing methods including two real world applications: multi-label classification and multi-view feature extraction. Experimental results show that our methods not only perform competitively to or better than baselines but also are more efficient.