MLNov 21, 2022
A Bi-level Nonlinear Eigenvector Algorithm for Wasserstein Discriminant AnalysisDong Min Roh, Zhaojun Bai, Ren-Cang Li
Much like the classical Fisher linear discriminant analysis (LDA), the recently proposed Wasserstein discriminant analysis (WDA) is a linear dimensionality reduction method that seeks a projection matrix to maximize the dispersion of different data classes and minimize the dispersion of same data classes via a bi-level optimization. In contrast to LDA, WDA can account for both global and local interconnections between data classes by using the underlying principles of optimal transport. In this paper, a bi-level nonlinear eigenvector algorithm (WDA-nepv) is presented to fully exploit the structures of the bi-level optimization of WDA. The inner level of WDA-nepv for computing the optimal transport matrices is formulated as an eigenvector-dependent nonlinear eigenvalue problem (NEPv), and meanwhile, the outer level for trace ratio optimizations is formulated as another NEPv. Both NEPvs can be computed efficiently under the self-consistent field (SCF) framework. WDA-nepv is derivative-free and surrogate-model-free when compared with existing algorithms. Convergence analysis of the proposed WDA-nepv justifies the utilization of the SCF for solving the bi-level optimization of WDA. Numerical experiments with synthetic and real-life datasets demonstrate the classification accuracy and scalability of WDA-nepv.
67.8NAMay 18
Convergence Analysis of Two Alternating Iterative Schemes for Tucker DecompositionRen-Cang Li, Li Wang, Mei Yang
The higher-order orthogonal iteration (HOOI) and the alternating subspace iteration (ASI) are two popular numerical methods for computing the Tucker decomposition of a multiple-mode tensor. Xu [Linear and Multilinear Algebra, 66(11):2247--2265, 2018] proposed a variation of HOOI, called the greedy HOOI, which has an extra alignment action between consecutive approximations. Kroonenberg and De Leeuw [Psychometrika, 45(1):69--97, 1980] analyzed the convergence of ASI but their analysis has gaps. These analysis were for a real tensor only. In this paper, we present detailed convergence analysis of the two methods that is applicable to a complex tensor with a real tensor being a special case, and it is shown both methods are globally convergent to stationary points under mild conditions while the objective function monotonically increases. Numerical examples are presented to demonstrate the convergence behavior of the methods.
82.7NAMay 13
An NPDo Approach for Tensor Block-DiagonalizationRen-Cang Li, Li Wang, Mei Yang
This paper is concerned with Partial Tensor Block-Diagonalization of a multiway tensor by orthonormal matrices so that the extracted block-diagonal part optimally represents the tensor. The basic idea is to maximize the block-diagonal part via the tensor's mode-multiplications by orthonormal matrices. For that reason, it will be referred to Principal Tensor Block-Diagonalization (PTBD), which contains the Tucker decomposition (TD) of a tensor as a special case with just one block. Also as a special case is the approximate dominant tensor SVD in which each block-size is 1-by-1. An NPDo approach is proposed to optimize the block-diagonal part for computing \ptbd. It is shown the NPDo approach combined with Gauss-Seidel-type updating is globally convergent to a stationary point while the objective increases monotonically. Numerical experiments are presented to illustrate the efficiency of the NPDo approach.
14.9NAMay 9
An NPDo Approach for Principal Joint SVD-type Block DiagonalizationRen-Cang Li, Li Wang, Mei Yang
This paper is concerned with partial Joint SVD-type Block Diagonalization of several matrices so that the extracted diagonal parts collectively optimally assume part of the total mass of all given matrices. For that reason, it will be referred also as Principal Joint SVD-type Block Diagonalization. When each block-size is 1-by-1, it is about finding a dominant partial joint SVD decomposition for the matrices of interests. An NPDo approach is proposed for maximizing the common dominant block-diagonal parts collectively. It is shown that the NPDo approach combined with Gauss-Seidel-type updating is globally convergent to a stationary point while the objective increases monotonically. Numerical experiments are presented to illustrate the efficiency of the NPDo approach.
SYNov 6, 2024
A Comparative Study of Deep Reinforcement Learning for Crop Production ManagementJoseph Balderas, Dong Chen, Yanbo Huang et al.
Crop production management is essential for optimizing yield and minimizing a field's environmental impact to crop fields, yet it remains challenging due to the complex and stochastic processes involved. Recently, researchers have turned to machine learning to address these complexities. Specifically, reinforcement learning (RL), a cutting-edge approach designed to learn optimal decision-making strategies through trial and error in dynamic environments, has emerged as a promising tool for developing adaptive crop management policies. RL models aim to optimize long-term rewards by continuously interacting with the environment, making them well-suited for tackling the uncertainties and variability inherent in crop management. Studies have shown that RL can generate crop management policies that compete with, and even outperform, expert-designed policies within simulation-based crop models. In the gym-DSSAT crop model environment, one of the most widely used simulators for crop management, proximal policy optimization (PPO) and deep Q-networks (DQN) have shown promising results. However, these methods have not yet been systematically evaluated under identical conditions. In this study, we evaluated PPO and DQN against static baseline policies across three different RL tasks, fertilization, irrigation, and mixed management, provided by the gym-DSSAT environment. To ensure a fair comparison, we used consistent default parameters, identical reward functions, and the same environment settings. Our results indicate that PPO outperforms DQN in fertilization and irrigation tasks, while DQN excels in the mixed management task. This comparative analysis provides critical insights into the strengths and limitations of each approach, advancing the development of more effective RL-based crop management strategies.
OCJan 12, 2021
Trace Ratio Optimization with an Application to Multi-view LearningLi Wang, Lei-Hong Zhang, Ren-Cang Li
A trace ratio optimization problem over the Stiefel manifold is investigated from the perspectives of both theory and numerical computations. At least three special cases of the problem have arisen from Fisher linear discriminant analysis, canonical correlation analysis, and unbalanced Procrustes problem, respectively. Necessary conditions in the form of nonlinear eigenvalue problem with eigenvector dependency are established and a numerical method based on the self-consistent field (SCF) iteration is designed and proved to be always convergent. As an application to multi-view subspace learning, a new framework and its instantiated concrete models are proposed and demonstrated on real world data sets. Numerical results show that the efficiency of the proposed numerical methods and effectiveness of the new multi-view subspace learning models.
LGNov 22, 2020
Uncorrelated Semi-paired Subspace LearningLi Wang, Lei-Hong Zhang, Chungen Shen et al.
Multi-view datasets are increasingly collected in many real-world applications, and we have seen better learning performance by existing multi-view learning methods than by conventional single-view learning methods applied to each view individually. But, most of these multi-view learning methods are built on the assumption that at each instance no view is missing and all data points from all views must be perfectly paired. Hence they cannot handle unpaired data but ignore them completely from their learning process. However, unpaired data can be more abundant in reality than paired ones and simply ignoring all unpaired data incur tremendous waste in resources. In this paper, we focus on learning uncorrelated features by semi-paired subspace learning, motivated by many existing works that show great successes of learning uncorrelated features. Specifically, we propose a generalized uncorrelated multi-view subspace learning framework, which can naturally integrate many proven learning criteria on the semi-paired data. To showcase the flexibility of the framework, we instantiate five new semi-paired models for both unsupervised and semi-supervised learning. We also design a successive alternating approximation (SAA) method to solve the resulting optimization problem and the method can be combined with the powerful Krylov subspace projection technique if needed. Extensive experimental results on multi-view feature extraction and multi-modality classification show that our proposed models perform competitively to or better than the baselines.
LGOct 4, 2020
Orthogonal Multi-view Analysis by Successive Approximations via EigenvectorsLi Wang, Leihong Zhang, Chungen Shen et al.
We propose a unified framework for multi-view subspace learning to learn individual orthogonal projections for all views. The framework integrates the correlations within multiple views, supervised discriminant capacity, and distance preservation in a concise and compact way. It not only includes several existing models as special cases, but also inspires new novel models. To demonstrate its versatility to handle different learning scenarios, we showcase three new multi-view discriminant analysis models and two new multi-view multi-label classification ones under this framework. An efficient numerical method based on successive approximations via eigenvectors is presented to solve the associated optimization problem. The method is built upon an iterative Krylov subspace method which can easily scale up for high-dimensional datasets. Extensive experiments are conducted on various real-world datasets for multi-view discriminant analysis and multi-view multi-label classification. The experimental results demonstrate that the proposed models are consistently competitive to and often better than the compared methods that do not learn orthogonal projections.
LGJul 9, 2020
Multi-view Orthonormalized Partial Least Squares: Regularizations and Deep ExtensionsLi Wang, Ren-Cang Li, Wen-Wei
We establish a family of subspace-based learning method for multi-view learning using the least squares as the fundamental basis. Specifically, we investigate orthonormalized partial least squares (OPLS) and study its important properties for both multivariate regression and classification. Building on the least squares reformulation of OPLS, we propose a unified multi-view learning framework to learn a classifier over a common latent space shared by all views. The regularization technique is further leveraged to unleash the power of the proposed framework by providing three generic types of regularizers on its inherent ingredients including model parameters, decision values and latent projected points. We instantiate a set of regularizers in terms of various priors. The proposed framework with proper choices of regularizers not only can recast existing methods, but also inspire new models. To further improve the performance of the proposed framework on complex real problems, we propose to learn nonlinear transformations parameterized by deep networks. Extensive experiments are conducted to compare various methods on nine data sets with different numbers of views in terms of both feature extraction and cross-modal retrieval.
NANov 7, 2019
Linear Constrained Rayleigh Quotient Optimization: Theory and AlgorithmsYunshen Zhou, Zhaojun Bai, Ren-Cang Li
We consider the following constrained Rayleigh quotient optimization problem (CRQopt) $$ \min_{x\in \mathbb{R}^n} x^{T}Ax\,\,\mbox{subject to}\,\, x^{T}x=1\,\mbox{and}\,C^{T}x=b, $$ where $A$ is an $n\times n$ real symmetric matrix and $C$ is an $n\times m$ real matrix. Usually, $m\ll n$. The problem is also known as the constrained eigenvalue problem in the literature because it becomes an eigenvalue problem if the linear constraint $C^{T}x=b$ is removed. We start by equivalently transforming CRQopt into an optimization problem, called LGopt, of minimizing the Lagrangian multiplier of CRQopt, and then an problem, called QEPmin, of finding the smallest eigenvalue of a quadratic eigenvalue problem. Although such equivalences has been discussed in the literature, it appears to be the first time that these equivalences are rigorously justified. Then we propose to numerically solve LGopt and QEPmin by the Krylov subspace projection method via the Lanczos process. The basic idea, as the Lanczos method for the symmetric eigenvalue problem, is to first reduce LGopt and QEPmin by projecting them onto Krylov subspaces to yield problems of the same types but of much smaller sizes, and then solve the reduced problems by some direct methods, which is either a secular equation solver (in the case of LGopt) or an eigensolver (in the case of QEPmin). The resulting algorithm is called the Lanczos algorithm. We perform convergence analysis for the proposed method and obtain error bounds. The sharpness of the error bound is demonstrated by artificial examples, although in applications the method often converges much faster than the bounds suggest. Finally, we apply the Lanczos algorithm to semi-supervised learning in the context of constrained clustering.
LGSep 25, 2019
A Self-consistent-field Iteration for Orthogonal Canonical Correlation AnalysisLeihong Zhang, Li Wang, Zhaojun Bai et al.
We propose an efficient algorithm for solving orthogonal canonical correlation analysis (OCCA) in the form of trace-fractional structure and orthogonal linear projections. Even though orthogonality has been widely used and proved to be a useful criterion for pattern recognition and feature extraction, existing methods for solving OCCA problem are either numerical unstable by relying on a deflation scheme, or less efficient by directly using generic optimization methods. In this paper, we propose an alternating numerical scheme whose core is the sub-maximization problem in the trace-fractional form with an orthogonal constraint. A customized self-consistent-field (SCF) iteration for this sub-maximization problem is devised. It is proved that the SCF iteration is globally convergent to a KKT point and that the alternating numerical scheme always converges. We further formulate a new trace-fractional maximization problem for orthogonal multiset CCA (OMCCA) and then propose an efficient algorithm with an either Jacobi-style or Gauss-Seidel-style updating scheme based on the same SCF iteration. Extensive experiments are conducted to evaluate the proposed algorithms against existing methods including two real world applications: multi-label classification and multi-view feature extraction. Experimental results show that our methods not only perform competitively to or better than baselines but also are more efficient.