Longxiu Huang

LG
h-index30
23papers
1,237citations
Novelty48%
AI Score49

23 Papers

LGAug 20, 2022
Matrix Completion with Cross-Concentrated Sampling: Bridging Uniform Sampling and CUR Sampling

HanQin Cai, Longxiu Huang, Pengyu Li et al.

While uniform sampling has been widely studied in the matrix completion literature, CUR sampling approximates a low-rank matrix via row and column samples. Unfortunately, both sampling models lack flexibility for various circumstances in real-world applications. In this work, we propose a novel and easy-to-implement sampling strategy, coined Cross-Concentrated Sampling (CCS). By bridging uniform sampling and CUR sampling, CCS provides extra flexibility that can potentially save sampling costs in applications. In addition, we also provide a sufficient condition for CCS-based matrix completion. Moreover, we propose a highly efficient non-convex algorithm, termed Iterative CUR Completion (ICURC), for the proposed CCS model. Numerical experiments verify the empirical advantages of CCS and ICURC against uniform sampling and its baseline algorithms, on both synthetic and real-world datasets.

OCFeb 24, 2023
Randomized Kaczmarz in Adversarial Distributed Setting

Longxiu Huang, Xia Li, Deanna Needell

Developing large-scale distributed methods that are robust to the presence of adversarial or corrupted workers is an important part of making such methods practical for real-world problems. In this paper, we propose an iterative approach that is adversary-tolerant for convex optimization problems. By leveraging simple statistics, our method ensures convergence and is capable of adapting to adversarial distributions. Additionally, the efficiency of the proposed methods for solving convex problems is shown in simulations with the presence of adversaries. Through simulations, we demonstrate the efficiency of our approach in the presence of adversaries and its ability to identify adversarial workers with high accuracy and tolerate varying levels of adversary rates.

LGMar 17, 2023
Non-convex approaches for low-rank tensor completion under tubal sampling

Zheng Tan, Longxiu Huang, HanQin Cai et al.

Tensor completion is an important problem in modern data analysis. In this work, we investigate a specific sampling strategy, referred to as tubal sampling. We propose two novel non-convex tensor completion frameworks that are easy to implement, named tensor $L_1$-$L_2$ (TL12) and tensor completion via CUR (TCCUR). We test the efficiency of both methods on synthetic data and a color image inpainting problem. Empirical results reveal a trade-off between the accuracy and time efficiency of these two methods in a low sampling ratio. Each of them outperforms some classical completion methods in at least one aspect.

NAMay 13
Revisiting CUR Perturbation Analysis: A Local Tangent-Space Expansion

Longxiu Huang

CUR decompositions approximate a matrix using selected columns, rows, and their intersection. Classical CUR theory provides exactness results for low-rank matrices and perturbation bounds controlled by the size of the noise. In this work we develop a local perturbation expansion for a fixed-index rank-truncated CUR map near an admissible rank-\(r\) matrix. We show that the Fréchet derivative of the rank-truncated CUR map is a sampling-induced oblique tangent-space projector determined by the selected rows and columns. Consequently, the local recovery error for an underlying low-rank matrix is governed not by the full perturbation norm alone, but by the image of the perturbation under this sampling-induced tangent projector. In particular, perturbations that are invisible to the selected rows and columns are removed to first order. We compare this behavior with the classical local expansion of the rank-\(r\) SVD truncation. SVD removes orthogonal-normal perturbations to first order, whereas rank-truncated CUR removes perturbations in the kernel of the sampling-induced oblique tangent projector. Numerical experiments illustrate these regimes and confirm the predicted first- and second-order local rates.

LGJan 30, 2024
Coseparable Nonnegative Tensor Factorization With T-CUR Decomposition

Juefei Chen, Longxiu Huang, Yimin Wei

Nonnegative Matrix Factorization (NMF) is an important unsupervised learning method to extract meaningful features from data. To address the NMF problem within a polynomial time framework, researchers have introduced a separability assumption, which has recently evolved into the concept of coseparability. This advancement offers a more efficient core representation for the original data. However, in the real world, the data is more natural to be represented as a multi-dimensional array, such as images or videos. The NMF's application to high-dimensional data involves vectorization, which risks losing essential multi-dimensional correlations. To retain these inherent correlations in the data, we turn to tensors (multidimensional arrays) and leverage the tensor t-product. This approach extends the coseparable NMF to the tensor setting, creating what we term coseparable Nonnegative Tensor Factorization (NTF). In this work, we provide an alternating index selection method to select the coseparable core. Furthermore, we validate the t-CUR sampling theory and integrate it with the tensor Discrete Empirical Interpolation Method (t-DEIM) to introduce an alternative, randomized index selection process. These methods have been tested on both synthetic and facial analysis datasets. The results demonstrate the efficiency of coseparable NTF when compared to coseparable NMF.

MLJan 28, 2024
On the Robustness of Cross-Concentrated Sampling for Matrix Completion

HanQin Cai, Longxiu Huang, Chandra Kundu et al.

Matrix completion is one of the crucial tools in modern data science research. Recently, a novel sampling model for matrix completion coined cross-concentrated sampling (CCS) has caught much attention. However, the robustness of the CCS model against sparse outliers remains unclear in the existing studies. In this paper, we aim to answer this question by exploring a novel Robust CCS Completion problem. A highly efficient non-convex iterative algorithm, dubbed Robust CUR Completion (RCURC), is proposed. The empirical performance of the proposed algorithm, in terms of both efficiency and robustness, is verified in synthetic and real datasets.

SPFeb 4, 2025
Three-dimensional signal processing: a new approach in dynamical sampling via tensor products

Yisen Wang, Hanqin Cai, Longxiu Huang

The dynamical sampling problem is centered around reconstructing signals that evolve over time according to a dynamical process, from spatial-temporal samples that may be noisy. This topic has been thoroughly explored for one-dimensional signals. Multidimensional signal recovery has also been studied, but primarily in scenarios where the driving operator is a convolution operator. In this work, we shift our focus to the dynamical sampling problem in the context of three-dimensional signal recovery, where the evolution system can be characterized by tensor products. Specifically, we provide a necessary condition for the sampling set that ensures successful recovery of the three-dimensional signal. Furthermore, we reformulate the reconstruction problem as an optimization task, which can be solved efficiently. To demonstrate the effectiveness of our approach, we include some straightforward numerical simulations that showcase the reconstruction performance.

CVJan 8, 2025
Learnable Scaled Gradient Descent for Guaranteed Robust Tensor PCA

Lanlan Feng, Ce Zhu, Yipeng Liu et al.

Robust tensor principal component analysis (RTPCA) aims to separate the low-rank and sparse components from multi-dimensional data, making it an essential technique in the signal processing and computer vision fields. Recently emerging tensor singular value decomposition (t-SVD) has gained considerable attention for its ability to better capture the low-rank structure of tensors compared to traditional matrix SVD. However, existing methods often rely on the computationally expensive tensor nuclear norm (TNN), which limits their scalability for real-world tensors. To address this issue, we explore an efficient scaled gradient descent (SGD) approach within the t-SVD framework for the first time, and propose the RTPCA-SGD method. Theoretically, we rigorously establish the recovery guarantees of RTPCA-SGD under mild assumptions, demonstrating that with appropriate parameter selection, it achieves linear convergence to the true low-rank tensor at a constant rate, independent of the condition number. To enhance its practical applicability, we further propose a learnable self-supervised deep unfolding model, which enables effective parameter learning. Numerical experiments on both synthetic and real-world datasets demonstrate the superior performance of the proposed methods while maintaining competitive computational efficiency, especially consuming less time than RTPCA-TNN.

ITApr 10
Robust Spectral Recovery for Dynamical Sampling

HanQin Cai, Longxiu Huang, Tianming Wang et al.

We study the spectral recovery problem for dynamical sampling on a finite cyclic grid. Given time snapshots obtained from a fixed uniform spatial subsampling of the orbit $x_{\ell}=A^{\ell}f$, we aim to recover the spectrum of the unknown circular convolution operator $A$. However, in the presence of outliers, even in only a few snapshots, existing approaches often struggle to recover the spectrum. We address this challenge by proposing a novel robust spectral recovery model in the presence of time-sparse corruptions. We propose a robust pipeline that lifts the problem to a sequence of robust low-rank Hankel recovery and completion tasks, followed by Prony-type spectral estimation. Numerical experiments confirm the accurate spectral recovery of the proposed approach and exhibit its superior robustness against state-of-the-art under various settings.

NASep 20, 2025
Randomized Space-Time Sampling for Affine Graph Dynamical Systems

Le Gong, Longxiu Huang

This paper investigates the problem of dynamical sampling for graph signals influenced by a constant source term. We consider signals evolving over time according to a linear dynamical system on a graph, where both the initial state and the source term are bandlimited. We introduce two random space-time sampling regimes and analyze the conditions under which stable recovery is achievable. While our framework extends recent work on homogeneous dynamics, it addresses a fundamentally different setting where the evolution includes a constant source term. This results in a non-orthogonal-diagonalizable system matrix, rendering classical spectral techniques inapplicable and introducing new challenges in sampling design, stability analysis, and joint recovery of both the initial state and the forcing term. A key component of our analysis is the spectral graph weighted coherence, which characterizes the interplay between the sampling distribution and the graph structure. We establish sampling complexity bounds ensuring stable recovery via the Restricted Isometry Property (RIP), and develop a robust recovery algorithm with provable error guarantees. The effectiveness of our method is validated through extensive experiments on both synthetic and real-world datasets.

LGJun 16, 2024
Guaranteed Sampling Flexibility for Low-tubal-rank Tensor Completion

Bowen Su, Juntao You, HanQin Cai et al.

While Bernoulli sampling is extensively studied in tensor completion, t-CUR sampling approximates low-tubal-rank tensors via lateral and horizontal subtensors. However, both methods lack sufficient flexibility for diverse practical applications. To address this, we introduce Tensor Cross-Concentrated Sampling (t-CCS), a novel and straightforward sampling model that advances the matrix cross-concentrated sampling concept within a tensor framework. t-CCS effectively bridges the gap between Bernoulli and t-CUR sampling, offering additional flexibility that can lead to computational savings in various contexts. A key aspect of our work is the comprehensive theoretical analysis provided. We establish a sufficient condition for the successful recovery of a low-rank tensor from its t-CCS samples. In support of this, we also develop a theoretical framework validating the feasibility of t-CUR via uniform random sampling and conduct a detailed theoretical sampling complexity analysis for tensor completion problems utilizing the general Bernoulli sampling model. Moreover, we introduce an efficient non-convex algorithm, the Iterative t-CUR Tensor Completion (ITCURTC) algorithm, specifically designed to tackle the t-CCS-based tensor completion. We have intensively tested and validated the effectiveness of the t-CCS model and the ITCURTC algorithm across both synthetic and real-world datasets.

MLJun 11, 2024
Accelerating Ill-conditioned Hankel Matrix Recovery via Structured Newton-like Descent

HanQin Cai, Longxiu Huang, Xiliang Lu et al.

This paper studies the robust Hankel recovery problem, which simultaneously removes the sparse outliers and fulfills missing entries from the partial observation. We propose a novel non-convex algorithm, coined Hankel Structured Newton-Like Descent (HSNLD), to tackle the robust Hankel recovery problem. HSNLD is highly efficient with linear convergence, and its convergence rate is independent of the condition number of the underlying Hankel matrix. The recovery guarantee has been established under some mild conditions. Numerical experiments on both synthetic and real datasets show the superior performance of HSNLD against state-of-the-art algorithms.

LGJun 9, 2024
Symmetric Matrix Completion with ReLU Sampling

Huikang Liu, Peng Wang, Longxiu Huang et al.

We study the problem of symmetric positive semi-definite low-rank matrix completion (MC) with deterministic entry-dependent sampling. In particular, we consider rectified linear unit (ReLU) sampling, where only positive entries are observed, as well as a generalization to threshold-based sampling. We first empirically demonstrate that the landscape of this MC problem is not globally benign: Gradient descent (GD) with random initialization will generally converge to stationary points that are not globally optimal. Nevertheless, we prove that when the matrix factor with a small rank satisfies mild assumptions, the nonconvex objective function is geodesically strongly convex on the quotient manifold in a neighborhood of a planted low-rank matrix. Moreover, we show that our assumptions are satisfied by a matrix factor with i.i.d. Gaussian entries. Finally, we develop a tailor-designed initialization for GD to solve our studied formulation, which empirically always achieves convergence to the global minima. We also conduct extensive experiments and compare MC methods, investigating convergence and completion performance with respect to initialization, noise level, dimension, and rank.

NAMay 6, 2023
Robust Tensor CUR Decompositions: Rapid Low-Tucker-Rank Tensor Recovery with Sparse Corruption

HanQin Cai, Zehan Chao, Longxiu Huang et al.

We study the tensor robust principal component analysis (TRPCA) problem, a tensorial extension of matrix robust principal component analysis (RPCA), that aims to split the given tensor into an underlying low-rank component and a sparse outlier component. This work proposes a fast algorithm, called Robust Tensor CUR Decompositions (RTCUR), for large-scale non-convex TRPCA problems under the Tucker rank setting. RTCUR is developed within a framework of alternating projections that projects between the set of low-rank tensors and the set of sparse tensors. We utilize the recently developed tensor CUR decomposition to substantially reduce the computational complexity in each projection. In addition, we develop four variants of RTCUR for different application settings. We demonstrate the effectiveness and computational advantages of RTCUR against state-of-the-art methods on both synthetic and real-world datasets.

LGFeb 28, 2022
Distributed randomized Kaczmarz for the adversarial workers

Xia Li, Longxiu Huang, Deanna Needell

Developing large-scale distributed methods that are robust to the presence of adversarial or corrupted workers is an important part of making such methods practical for real-world problems. Here, we propose an iterative approach that is adversary-tolerant for least-squares problems. The algorithm utilizes simple statistics to guarantee convergence and is capable of learning the adversarial distributions. Additionally, the efficiency of the proposed method is shown in simulations in the presence of adversaries. The results demonstrate the great capability of such methods to tolerate different levels of adversary rates and to identify the erroneous workers with high accuracy.

LGJan 31, 2022
Guided Semi-Supervised Non-negative Matrix Factorization on Legal Documents

Pengyu Li, Christine Tseng, Yaxuan Zheng et al.

Classification and topic modeling are popular techniques in machine learning that extract information from large-scale datasets. By incorporating a priori information such as labels or important features, methods have been developed to perform classification and topic modeling tasks; however, most methods that can perform both do not allow for guidance of the topics or features. In this paper, we propose a method, namely Guided Semi-Supervised Non-negative Matrix Factorization (GSSNMF), that performs both classification and topic modeling by incorporating supervision from both pre-assigned document class labels and user-designed seed words. We test the performance of this method through its application to legal documents provided by the California Innocence Project, a nonprofit that works to free innocent convicted persons and reform the justice system. The results show that our proposed method improves both classification accuracy and topic coherence in comparison to past methods like Semi-Supervised Non-negative Matrix Factorization (SSNMF) and Guided Non-negative Matrix Factorization (Guided NMF).

LGAug 23, 2021
Fast Robust Tensor Principal Component Analysis via Fiber CUR Decomposition

HanQin Cai, Zehan Chao, Longxiu Huang et al.

We study the problem of tensor robust principal component analysis (TRPCA), which aims to separate an underlying low-multilinear-rank tensor and a sparse outlier tensor from their sum. In this work, we propose a fast non-convex algorithm, coined Robust Tensor CUR (RTCUR), for large-scale TRPCA problems. RTCUR considers a framework of alternating projections and utilizes the recently developed tensor Fiber CUR decomposition to dramatically lower the computational complexity. The performance advantage of RTCUR is empirically verified against the state-of-the-arts on the synthetic datasets and is further demonstrated on the real-world application such as color video background subtraction.

NAMar 19, 2021
Mode-wise Tensor Decompositions: Multi-dimensional Generalizations of CUR Decompositions

HanQin Cai, Keaton Hamm, Longxiu Huang et al.

Low rank tensor approximation is a fundamental tool in modern machine learning and data science. In this paper, we study the characterization, perturbation analysis, and an efficient sampling strategy for two primary tensor CUR approximations, namely Chidori and Fiber CUR. We characterize exact tensor CUR decompositions for low multilinear rank tensors. We also present theoretical error bounds of the tensor CUR approximations when (adversarial or Gaussian) noise appears. Moreover, we show that low cost uniform sampling is sufficient for tensor CUR approximations if the tensor has an incoherent structure. Empirical performance evaluations, with both synthetic and real-world datasets, establish the speed advantage of the tensor CUR approximations over other state-of-the-art low multilinear rank tensor approximations.

CVJan 5, 2021
Robust CUR Decomposition: Theory and Imaging Applications

HanQin Cai, Keaton Hamm, Longxiu Huang et al.

This paper considers the use of Robust PCA in a CUR decomposition framework and applications thereof. Our main algorithms produce a robust version of column-row factorizations of matrices $\mathbf{D}=\mathbf{L}+\mathbf{S}$ where $\mathbf{L}$ is low-rank and $\mathbf{S}$ contains sparse outliers. These methods yield interpretable factorizations at low computational cost, and provide new CUR decompositions that are robust to sparse outliers, in contrast to previous methods. We consider two key imaging applications of Robust PCA: video foreground-background separation and face modeling. This paper examines the qualitative behavior of our Robust CUR decompositions on the benchmark videos and face datasets, and find that our method works as well as standard Robust PCA while being significantly faster. Additionally, we consider hybrid randomized and deterministic sampling methods which produce a compact CUR decomposition of a given matrix, and apply this to video sequences to produce canonical frames thereof.

MLOct 14, 2020
Rapid Robust Principal Component Analysis: CUR Accelerated Inexact Low Rank Estimation

HanQin Cai, Keaton Hamm, Longxiu Huang et al.

Robust principal component analysis (RPCA) is a widely used tool for dimension reduction. In this work, we propose a novel non-convex algorithm, coined Iterated Robust CUR (IRCUR), for solving RPCA problems, which dramatically improves the computational efficiency in comparison with the existing algorithms. IRCUR achieves this acceleration by employing CUR decomposition when updating the low rank component, which allows us to obtain an accurate low rank approximation via only three small submatrices. Consequently, IRCUR is able to process only the small submatrices and avoid expensive computing on the full matrix through the entire algorithm. Numerical experiments establish the computational advantage of IRCUR over the state-of-art algorithms on both synthetic and real-world datasets.

DLSep 7, 2020
COVID-19 Literature Topic-Based Search via Hierarchical NMF

Rachel Grotheer, Yihuan Huang, Pengyu Li et al.

A dataset of COVID-19-related scientific literature is compiled, combining the articles from several online libraries and selecting those with open access and full text available. Then, hierarchical nonnegative matrix factorization is used to organize literature related to the novel coronavirus into a tree structure that allows researchers to search for relevant literature based on detected topics. We discover eight major latent topics and 52 granular subtopics in the body of literature, related to vaccines, genetic structure and modeling of the disease and patient studies, as well as related diseases and virology. In order that our tool may help current researchers, an interactive website is created that organizes available literature using this hierarchical structure.

NAMar 22, 2019
CUR Decompositions, Approximations, and Perturbations

Keaton Hamm, Longxiu Huang

This article discusses a useful tool in dimensionality reduction and low-rank matrix approximation called the CUR decomposition. Various viewpoints of this method in the literature are synergized and are compared and contrasted; included in this is a new characterization of exact CUR decompositions. A novel perturbation analysis is performed on CUR approximations of noisy versions of low-rank matrices, which compares them with the putative CUR decomposition of the underlying low-rank part. Additionally, we give new column and row sampling results which allow one to conclude that a CUR decomposition of a low-rank matrix is attained with high probability. We then illustrate the stability of these sampling methods under the perturbations studied before, and provide numerical illustrations of the methods and bounds discussed.

NAOct 14, 2018
Dynamical sampling with additive random noise

Akram Aldroubi, Longxiu Huang, Ilya Krishtal et al.

Dynamical sampling deals with signals that evolve in time under the action of a linear operator. The purpose of the present paper is to analyze the performance of the basic dynamical sampling algorithms in the finite dimensional case and study the impact of additive noise. The algorithms are implemented and tested on synthetic and real data sets, and denoising techniques are integrated to mitigate the effect of the noise. We also develop theoretical and numerical results that validate the algorithm for recovering the driving operators, which are defined via a real symmetric convolution.