HanQin Cai

LG
h-index14
35papers
610citations
Novelty53%
AI Score56

35 Papers

LGDec 3, 2022
Laplacian Convolutional Representation for Traffic Time Series Imputation

Xinyu Chen, Zhanhong Cheng, HanQin Cai et al.

Spatiotemporal traffic data imputation is of great significance in intelligent transportation systems and data-driven decision-making processes. To perform efficient learning and accurate reconstruction from partially observed traffic data, we assert the importance of characterizing both global and local trends in time series. In the literature, substantial works have demonstrated the effectiveness of utilizing the low-rank property of traffic data by matrix/tensor completion models. In this study, we first introduce a Laplacian kernel to temporal regularization for characterizing local trends in traffic time series, which can be formulated as a circular convolution. Then, we develop a low-rank Laplacian convolutional representation (LCR) model by putting the circulant matrix nuclear norm and the Laplacian kernelized temporal regularization together, which is proved to meet a unified framework that has a fast Fourier transform (FFT) solution in log-linear time complexity. Through extensive experiments on several traffic datasets, we demonstrate the superiority of LCR over several baseline models for imputing traffic time series of various time series behaviors (e.g., data noises and strong/weak periodicity) and reconstructing sparse speed fields of vehicular traffic flow. The proposed LCR model is also an efficient solution to large-scale traffic data imputation over the existing imputation models.

LGAug 20, 2022
Matrix Completion with Cross-Concentrated Sampling: Bridging Uniform Sampling and CUR Sampling

HanQin Cai, Longxiu Huang, Pengyu Li et al.

While uniform sampling has been widely studied in the matrix completion literature, CUR sampling approximates a low-rank matrix via row and column samples. Unfortunately, both sampling models lack flexibility for various circumstances in real-world applications. In this work, we propose a novel and easy-to-implement sampling strategy, coined Cross-Concentrated Sampling (CCS). By bridging uniform sampling and CUR sampling, CCS provides extra flexibility that can potentially save sampling costs in applications. In addition, we also provide a sufficient condition for CCS-based matrix completion. Moreover, we propose a highly efficient non-convex algorithm, termed Iterative CUR Completion (ICURC), for the proposed CCS model. Numerical experiments verify the empirical advantages of CCS and ICURC against uniform sampling and its baseline algorithms, on both synthetic and real-world datasets.

LGMar 17, 2023
Non-convex approaches for low-rank tensor completion under tubal sampling

Zheng Tan, Longxiu Huang, HanQin Cai et al.

Tensor completion is an important problem in modern data analysis. In this work, we investigate a specific sampling strategy, referred to as tubal sampling. We propose two novel non-convex tensor completion frameworks that are easy to implement, named tensor $L_1$-$L_2$ (TL12) and tensor completion via CUR (TCCUR). We test the efficiency of both methods on synthetic data and a color image inpainting problem. Empirical results reveal a trade-off between the accuracy and time efficiency of these two methods in a low sampling ratio. Each of them outperforms some classical completion methods in at least one aspect.

MLJun 17, 2022
Riemannian CUR Decompositions for Robust Principal Component Analysis

Keaton Hamm, Mohamed Meskini, HanQin Cai

Robust Principal Component Analysis (PCA) has received massive attention in recent years. It aims to recover a low-rank matrix and a sparse matrix from their sum. This paper proposes a novel nonconvex Robust PCA algorithm, coined Riemannian CUR (RieCUR), which utilizes the ideas of Riemannian optimization and robust CUR decompositions. This algorithm has the same computational complexity as Iterated Robust CUR, which is currently state-of-the-art, but is more robust to outliers. RieCUR is also able to tolerate a significant amount of outliers, and is comparable to Accelerated Alternating Projections, which has high outlier tolerance but worse computational complexity than the proposed method. Thus, the proposed algorithm achieves state-of-the-art performance on Robust PCA both in terms of computational complexity and outlier tolerance.

LGAug 16, 2024
S$^3$Attention: Improving Long Sequence Attention with Smoothed Skeleton Sketching

Xue Wang, Tian Zhou, Jianqing Zhu et al.

Attention based models have achieved many remarkable breakthroughs in numerous applications. However, the quadratic complexity of Attention makes the vanilla Attention based models hard to apply to long sequence tasks. Various improved Attention structures are proposed to reduce the computation cost by inducing low rankness and approximating the whole sequence by sub-sequences. The most challenging part of those approaches is maintaining the proper balance between information preservation and computation reduction: the longer sub-sequences used, the better information is preserved, but at the price of introducing more noise and computational costs. In this paper, we propose a smoothed skeleton sketching based Attention structure, coined S$^3$Attention, which significantly improves upon the previous attempts to negotiate this trade-off. S$^3$Attention has two mechanisms to effectively minimize the impact of noise while keeping the linear complexity to the sequence length: a smoothing block to mix information over long sequences and a matrix sketching method that simultaneously selects columns and rows from the input matrix. We verify the effectiveness of S$^3$Attention both theoretically and empirically. Extensive studies over Long Range Arena (LRA) datasets and six time-series forecasting show that S$^3$Attention significantly outperforms both vanilla Attention and other state-of-the-art variants of Attention structures.

LGSep 2, 2024
Correlating Time Series with Interpretable Convolutional Kernels

Xinyu Chen, HanQin Cai, Fuqiang Liu et al.

This study addresses the problem of convolutional kernel learning in univariate, multivariate, and multidimensional time series data, which is crucial for interpreting temporal patterns in time series and supporting downstream machine learning tasks. First, we propose formulating convolutional kernel learning for univariate time series as a sparse regression problem with a non-negative constraint, leveraging the properties of circular convolution and circulant matrices. Second, to generalize this approach to multivariate and multidimensional time series data, we use tensor computations, reformulating the convolutional kernel learning problem in the form of tensors. This is further converted into a standard sparse regression problem through vectorization and tensor unfolding operations. In the proposed methodology, the optimization problem is addressed using the existing non-negative subspace pursuit method, enabling the convolutional kernel to capture temporal correlations and patterns. To evaluate the proposed model, we apply it to several real-world time series datasets. On the multidimensional rideshare and taxi trip data from New York City and Chicago, the convolutional kernels reveal interpretable local correlations and cyclical patterns, such as weekly seasonality. In the context of multidimensional fluid flow data, both local and nonlocal correlations captured by the convolutional kernels can reinforce tensor factorization, leading to performance improvements in fluid flow reconstruction tasks. Thus, this study lays an insightful foundation for automatically learning convolutional kernels from time series data, with an emphasis on interpretability through sparsity and non-negativity constraints.

45.3LGMay 9
TailedTS: Benchmark Dataset for Heavy-Tailed Time Series Prediction and Periodicity Quantification

Xinyu Chen, HanQin Cai, Lijun Ding et al.

We present TailedTS, a large-scale benchmark dataset derived from Wikipedia hourly page view observations throughout 2024, specifically designed to test time series forecasting models under heavy-tailed, zero-inflated, and non-Gaussian conditions. The dataset comprises approximately 24.69 billion data points spanning roughly 3 million unique Wikipedia pages per month, stored in high-efficiency Apache Parquet format. Wikipedia traffic follows a pronounced power-law distribution where roughly 5% of pages account for over 70% of total page views, creating a natural and rigorous testbed for model robustness against extreme volatility that are absent from or underrepresented in existing benchmarks such as M4, M5, and UCI electricity datasets. TailedTS enables several research tasks. First, we introduce a periodicity quantification framework based on sparse autoregression with sparsity and non-negativity constraints, revealing that frequently-viewed pages exhibit significantly weaker periodic structure than their less-viewed counterparts, showing direct implications for server allocation and traffic forecasting on large digital platforms. Second, we provide standardized prediction benchmarks evaluated under a suite of non-Gaussian loss functions, including $\ell_1$-norm, Huber, quantile, and $\ell_p$-norm losses, demonstrating that standard Gaussian-based estimators degrade substantially on high-volume page categories, while robust alternatives provide consistent gains across all traffic scales. TailedTS is publicly available at https://doi.org/10.5281/zenodo.17070469.

31.6LGMay 7
Memory Efficient Full-gradient Attacks (MEFA) Framework for Adversarial Defense Evaluations

Yuan Du, Mitchel Hill, HanQin Cai

This work studies the robust evaluation of iterative stochastic purification defenses under white-box adversarial attacks. Our key technical insight is that gradient checkpointing makes exact end-to-end gradient computation through long purification trajectories practical by trading additional recomputation for substantially lower memory usage. This enables full-gradient adaptive attacks against diffusion- and Langevin-based purification defenses, where prior evaluations often resort to approximate backpropagation due to memory constraints. These approximations can weaken the attack signal and risk overestimating robustness. In parallel, stochasticity in iterative purification is frequently under-controlled, even though different purification trajectories can substantially change reported robustness metrics. Building on this insight, we introduce a memory-efficient full-gradient evaluation framework for stochastic purification defenses. The framework combines checkpointed backpropagation with evaluation protocols that control stochastic variability, thereby reducing memory bottlenecks while preserving exact gradients. We evaluate diffusion-based purification and Langevin sampling with Energy-Based Models (EBMs), demonstrating that full-gradient attacks uncover vulnerabilities missed by approximate-gradient evaluations. Our framework yields stronger state-of-the-art $\ell_{\infty}$ and $\ell_{2}$ white-box attacks and further supports probing out-of-distribution robustness. Overall, our results show that exact-gradient evaluation is essential for reliable benchmarking of iterative stochastic defenses.

LGDec 31, 2024
Deeply Learned Robust Matrix Completion for Large-scale Low-rank Data Recovery

HanQin Cai, Chandra Kundu, Jialin Liu et al.

Robust matrix completion (RMC) is a widely used machine learning tool that simultaneously tackles two critical issues in low-rank data analysis: missing data entries and extreme outliers. This paper proposes a novel scalable and learnable non-convex approach, coined Learned Robust Matrix Completion (LRMC), for large-scale RMC problems. LRMC enjoys low computational complexity with linear convergence. Motivated by the proposed theorem, the free parameters of LRMC can be effectively learned via deep unfolding to achieve optimum performance. Furthermore, this paper proposes a flexible feedforward-recurrent-mixed neural network framework that extends deep unfolding from fix-number iterations to infinite iterations. The superior empirical performance of LRMC is verified with extensive experiments against state-of-the-art on synthetic datasets and real applications, including video background subtraction, ultrasound imaging, face modeling, and cloud removal from satellite imagery.

OCOct 22, 2024
Guarantees of a Preconditioned Subgradient Algorithm for Overparameterized Asymmetric Low-rank Matrix Recovery

Paris Giampouras, HanQin Cai, Rene Vidal

In this paper, we focus on a matrix factorization-based approach to recover low-rank {\it asymmetric} matrices from corrupted measurements. We propose an {\it Overparameterized Preconditioned Subgradient Algorithm (OPSA)} and provide, for the first time in the literature, linear convergence rates independent of the rank of the sought asymmetric matrix in the presence of gross corruptions. Our work goes beyond existing results in preconditioned-type approaches addressing their current limitation, i.e., the lack of convergence guarantees in the case of {\it asymmetric matrices of unknown rank}. By applying our approach to (robust) matrix sensing, we highlight its merits when the measurement operator satisfies a mixed-norm restricted isometry property. Lastly, we present extensive numerical experiments that validate our theoretical results and demonstrate the effectiveness of our approach for different levels of overparameterization and outlier corruptions.

MLJan 28, 2024
On the Robustness of Cross-Concentrated Sampling for Matrix Completion

HanQin Cai, Longxiu Huang, Chandra Kundu et al.

Matrix completion is one of the crucial tools in modern data science research. Recently, a novel sampling model for matrix completion coined cross-concentrated sampling (CCS) has caught much attention. However, the robustness of the CCS model against sparse outliers remains unclear in the existing studies. In this paper, we aim to answer this question by exploring a novel Robust CCS Completion problem. A highly efficient non-convex iterative algorithm, dubbed Robust CUR Completion (RCURC), is proposed. The empirical performance of the proposed algorithm, in terms of both efficiency and robustness, is verified in synthetic and real datasets.

SPFeb 4, 2025
Three-dimensional signal processing: a new approach in dynamical sampling via tensor products

Yisen Wang, Hanqin Cai, Longxiu Huang

The dynamical sampling problem is centered around reconstructing signals that evolve over time according to a dynamical process, from spatial-temporal samples that may be noisy. This topic has been thoroughly explored for one-dimensional signals. Multidimensional signal recovery has also been studied, but primarily in scenarios where the driving operator is a convolution operator. In this work, we shift our focus to the dynamical sampling problem in the context of three-dimensional signal recovery, where the evolution system can be characterized by tensor products. Specifically, we provide a necessary condition for the sampling set that ensures successful recovery of the three-dimensional signal. Furthermore, we reformulate the reconstruction problem as an optimization task, which can be solved efficiently. To demonstrate the effectiveness of our approach, we include some straightforward numerical simulations that showcase the reconstruction performance.

23.0ITApr 10
Robust Spectral Recovery for Dynamical Sampling

HanQin Cai, Longxiu Huang, Tianming Wang et al.

We study the spectral recovery problem for dynamical sampling on a finite cyclic grid. Given time snapshots obtained from a fixed uniform spatial subsampling of the orbit $x_{\ell}=A^{\ell}f$, we aim to recover the spectrum of the unknown circular convolution operator $A$. However, in the presence of outliers, even in only a few snapshots, existing approaches often struggle to recover the spectrum. We address this challenge by proposing a novel robust spectral recovery model in the presence of time-sparse corruptions. We propose a robust pipeline that lifts the problem to a sequence of robust low-rank Hankel recovery and completion tasks, followed by Prony-type spectral estimation. Numerical experiments confirm the accurate spectral recovery of the proposed approach and exhibit its superior robustness against state-of-the-art under various settings.

LGDec 14, 2024
Structured Sampling for Robust Euclidean Distance Geometry

Chandra Kundu, Abiy Tasissa, HanQin Cai

This paper addresses the problem of estimating the positions of points from distance measurements corrupted by sparse outliers. Specifically, we consider a setting with two types of nodes: anchor nodes, for which exact distances to each other are known, and target nodes, for which complete but corrupted distance measurements to the anchors are available. To tackle this problem, we propose a novel algorithm powered by Nyström method and robust principal component analysis. Our method is computationally efficient as it processes only a localized subset of the distance matrix and does not require distance measurements between target nodes. Empirical evaluations on synthetic datasets, designed to mimic sensor localization, and on molecular experiments, demonstrate that our algorithm achieves accurate recovery with a modest number of anchors, even in the presence of high levels of sparse outliers.

MLSep 23, 2025
Recovering Wasserstein Distance Matrices from Few Measurements

Muhammad Rana, Abiy Tasissa, HanQin Cai et al.

This paper proposes two algorithms for estimating square Wasserstein distance matrices from a small number of entries. These matrices are used to compute manifold learning embeddings like multidimensional scaling (MDS) or Isomap, but contrary to Euclidean distance matrices, are extremely costly to compute. We analyze matrix completion from upper triangular samples and Nyström completion in which $\mathcal{O}(d\log(d))$ columns of the distance matrices are computed where $d$ is the desired embedding dimension, prove stability of MDS under Nyström completion, and show that it can outperform matrix completion for a fixed budget of sample distances. Finally, we show that classification of the OrganCMNIST dataset from the MedMNIST benchmark is stable on data embedded from the Nyström estimation of the distance matrix even when only 10\% of the columns are computed.

SIAug 2, 2025
Data-Driven Discovery of Mobility Periodicity for Understanding Urban Systems

Xinyu Chen, Qi Wang, Yunhan Zheng et al.

Human mobility regularity is crucial for understanding urban dynamics and informing decision-making processes. This study first quantifies the periodicity in complex human mobility data as a sparse identification of dominant positive auto-correlations in time series autoregression and then discovers periodic patterns. We apply the framework to large-scale metro passenger flow data in Hangzhou, China and multi-modal mobility data in New York City and Chicago, USA, revealing the interpretable weekly periodicity across different spatial locations over past several years. The analysis of ridesharing data from 2019 to 2024 demonstrates the disruptive impact of the pandemic on mobility regularity and the subsequent recovery trends. In 2024, the periodic mobility patterns of ridesharing, taxi, subway, and bikesharing in Manhattan uncover the regularity and variability of these travel modes. Our findings highlight the potential of interpretable machine learning to discover spatiotemporal mobility patterns and offer a valuable tool for understanding urban systems.

OCJul 31, 2025
Provable Non-Convex Euclidean Distance Matrix Completion: Geometry, Reconstruction, and Robustness

Chandler Smith, HanQin Cai, Abiy Tasissa

The problem of recovering the configuration of points from their partial pairwise distances, referred to as the Euclidean Distance Matrix Completion (EDMC) problem, arises in a broad range of applications, including sensor network localization, molecular conformation, and manifold learning. In this paper, we propose a Riemannian optimization framework for solving the EDMC problem by formulating it as a low-rank matrix completion task over the space of positive semi-definite Gram matrices. The available distance measurements are encoded as expansion coefficients in a non-orthogonal basis, and optimization over the Gram matrix implicitly enforces geometric consistency through nonnegativity and the triangle inequality, a structure inherited from classical multidimensional scaling. Under a Bernoulli sampling model for observed distances, we prove that Riemannian gradient descent on the manifold of rank-$r$ matrices locally converges linearly with high probability when the sampling probability satisfies $p\geq O(ν^2 r^2\log(n)/n)$, where $ν$ is an EDMC-specific incoherence parameter. Furthermore, we provide an initialization candidate using a one-step hard thresholding procedure that yields convergence, provided the sampling probability satisfies $p \geq O(νr^{3/2}\log^{3/4}(n)/n^{1/4})$. A key technical contribution of this work is the analysis of a symmetric linear operator arising from a dual basis expansion in the non-orthogonal basis, which requires a novel application of the Hanson-Wright inequality to establish an optimal restricted isometry property in the presence of coupled terms. Empirical evaluations on synthetic data demonstrate that our algorithm achieves competitive performance relative to state-of-the-art methods. Moreover, we provide a geometric interpretation of matrix incoherence tailored to the EDMC setting and provide robustness guarantees for our method.

LGMay 23, 2025
A Dual Basis Approach for Structured Robust Euclidean Distance Geometry

Chandra Kundu, Abiy Tasissa, HanQin Cai

Euclidean Distance Matrix (EDM), which consists of pairwise squared Euclidean distances of a given point configuration, finds many applications in modern machine learning. This paper considers the setting where only a set of anchor nodes is used to collect the distances between themselves and the rest. In the presence of potential outliers, it results in a structured partial observation on EDM with partial corruptions. Note that an EDM can be connected to a positive semi-definite Gram matrix via a non-orthogonal dual basis. Inspired by recent development of non-orthogonal dual basis in optimization, we propose a novel algorithmic framework, dubbed Robust Euclidean Distance Geometry via Dual Basis (RoDEoDB), for recovering the Euclidean distance geometry, i.e., the underlying point configuration. The exact recovery guarantees have been established in terms of both the Gram matrix and point configuration, under some mild conditions. Empirical experiments show superior performance of RoDEoDB on sensor localization and molecular conformation datasets.

CVJan 19, 2025
Explainable Adversarial Attacks on Coarse-to-Fine Classifiers

Akram Heidarizadeh, Connor Hatfield, Lorenzo Lazzarotto et al.

Traditional adversarial attacks typically aim to alter the predicted labels of input images by generating perturbations that are imperceptible to the human eye. However, these approaches often lack explainability. Moreover, most existing work on adversarial attacks focuses on single-stage classifiers, but multi-stage classifiers are largely unexplored. In this paper, we introduce instance-based adversarial attacks for multi-stage classifiers, leveraging Layer-wise Relevance Propagation (LRP), which assigns relevance scores to pixels based on their influence on classification outcomes. Our approach generates explainable adversarial perturbations by utilizing LRP to identify and target key features critical for both coarse and fine-grained classifications. Unlike conventional attacks, our method not only induces misclassification but also enhances the interpretability of the model's behavior across classification stages, as demonstrated by experimental results.

IVNov 25, 2024
WTDUN: Wavelet Tree-Structured Sampling and Deep Unfolding Network for Image Compressed Sensing

Kai Han, Jin Wang, Yunhui Shi et al.

Deep unfolding networks have gained increasing attention in the field of compressed sensing (CS) owing to their theoretical interpretability and superior reconstruction performance. However, most existing deep unfolding methods often face the following issues: 1) they learn directly from single-channel images, leading to a simple feature representation that does not fully capture complex features; and 2) they treat various image components uniformly, ignoring the characteristics of different components. To address these issues, we propose a novel wavelet-domain deep unfolding framework named WTDUN, which operates directly on the multi-scale wavelet subbands. Our method utilizes the intrinsic sparsity and multi-scale structure of wavelet coefficients to achieve a tree-structured sampling and reconstruction, effectively capturing and highlighting the most important features within images. Specifically, the design of tree-structured reconstruction aims to capture the inter-dependencies among the multi-scale subbands, enabling the identification of both fine and coarse features, which can lead to a marked improvement in reconstruction quality. Furthermore, a wavelet domain adaptive sampling method is proposed to greatly improve the sampling capability, which is realized by assigning measurements to each wavelet subband based on its importance. Unlike pure deep learning methods that treat all components uniformly, our method introduces a targeted focus on important subbands, considering their energy and sparsity. This targeted strategy lets us capture key information more efficiently while discarding less important information, resulting in a more effective and detailed reconstruction. Extensive experimental results on various datasets validate the superior performance of our proposed method.

LGJun 16, 2024
Guaranteed Sampling Flexibility for Low-tubal-rank Tensor Completion

Bowen Su, Juntao You, HanQin Cai et al.

While Bernoulli sampling is extensively studied in tensor completion, t-CUR sampling approximates low-tubal-rank tensors via lateral and horizontal subtensors. However, both methods lack sufficient flexibility for diverse practical applications. To address this, we introduce Tensor Cross-Concentrated Sampling (t-CCS), a novel and straightforward sampling model that advances the matrix cross-concentrated sampling concept within a tensor framework. t-CCS effectively bridges the gap between Bernoulli and t-CUR sampling, offering additional flexibility that can lead to computational savings in various contexts. A key aspect of our work is the comprehensive theoretical analysis provided. We establish a sufficient condition for the successful recovery of a low-rank tensor from its t-CCS samples. In support of this, we also develop a theoretical framework validating the feasibility of t-CUR via uniform random sampling and conduct a detailed theoretical sampling complexity analysis for tensor completion problems utilizing the general Bernoulli sampling model. Moreover, we introduce an efficient non-convex algorithm, the Iterative t-CUR Tensor Completion (ITCURTC) algorithm, specifically designed to tackle the t-CCS-based tensor completion. We have intensively tested and validated the effectiveness of the t-CCS model and the ITCURTC algorithm across both synthetic and real-world datasets.

MLJun 11, 2024
Accelerating Ill-conditioned Hankel Matrix Recovery via Structured Newton-like Descent

HanQin Cai, Longxiu Huang, Xiliang Lu et al.

This paper studies the robust Hankel recovery problem, which simultaneously removes the sparse outliers and fulfills missing entries from the partial observation. We propose a novel non-convex algorithm, coined Hankel Structured Newton-Like Descent (HSNLD), to tackle the robust Hankel recovery problem. HSNLD is highly efficient with linear convergence, and its convergence rate is independent of the condition number of the underlying Hankel matrix. The recovery guarantee has been established under some mild conditions. Numerical experiments on both synthetic and real datasets show the superior performance of HSNLD against state-of-the-art algorithms.

LGMay 29, 2023
Towards Constituting Mathematical Structures for Learning to Optimize

Jialin Liu, Xiaohan Chen, Zhangyang Wang et al.

Learning to Optimize (L2O), a technique that utilizes machine learning to learn an optimization algorithm automatically from data, has gained arising attention in recent years. A generic L2O approach parameterizes the iterative update rule and learns the update direction as a black-box network. While the generic approach is widely applicable, the learned model can overfit and may not generalize well to out-of-distribution test sets. In this paper, we derive the basic mathematical conditions that successful update rules commonly satisfy. Consequently, we propose a novel L2O model with a mathematics-inspired structure that is broadly applicable and generalized well to out-of-distribution problems. Numerical simulations validate our theoretical findings and demonstrate the superior empirical performance of the proposed L2O model.

NAMay 6, 2023
Robust Tensor CUR Decompositions: Rapid Low-Tucker-Rank Tensor Recovery with Sparse Corruption

HanQin Cai, Zehan Chao, Longxiu Huang et al.

We study the tensor robust principal component analysis (TRPCA) problem, a tensorial extension of matrix robust principal component analysis (RPCA), that aims to split the given tensor into an underlying low-rank component and a sparse outlier component. This work proposes a fast algorithm, called Robust Tensor CUR Decompositions (RTCUR), for large-scale non-convex TRPCA problems under the Tucker rank setting. RTCUR is developed within a framework of alternating projections that projects between the set of low-rank tensors and the set of sparse tensors. We utilize the recently developed tensor CUR decomposition to substantially reduce the computational complexity in each projection. In addition, we develop four variants of RTCUR for different application settings. We demonstrate the effectiveness and computational advantages of RTCUR against state-of-the-art methods on both synthetic and real-world datasets.

LGOct 11, 2021
Learned Robust PCA: A Scalable Deep Unfolding Approach for High-Dimensional Outlier Detection

HanQin Cai, Jialin Liu, Wotao Yin

Robust principal component analysis (RPCA) is a critical tool in modern machine learning, which detects outliers in the task of low-rank matrix reconstruction. In this paper, we propose a scalable and learnable non-convex approach for high-dimensional RPCA problems, which we call Learned Robust PCA (LRPCA). LRPCA is highly efficient, and its free parameters can be effectively learned to optimize via deep unfolding. Moreover, we extend deep unfolding from finite iterations to infinite iterations via a novel feedforward-recurrent-mixed neural network model. We establish the recovery guarantee of LRPCA under mild assumptions for RPCA. Numerical experiments show that LRPCA outperforms the state-of-the-art RPCA algorithms, such as ScaledGD and AltProj, on both synthetic datasets and real-world applications.

OCSep 27, 2021
Curvature-Aware Derivative-Free Optimization

Bumsu Kim, HanQin Cai, Daniel McKenzie et al.

The paper discusses derivative-free optimization (DFO), which involves minimizing a function without access to gradients or directional derivatives, only function evaluations. Classical DFO methods, which mimic gradient-based methods, such as Nelder-Mead and direct search have limited scalability for high-dimensional problems. Zeroth-order methods have been gaining popularity due to the demands of large-scale machine learning applications, and the paper focuses on the selection of the step size $α_k$ in these methods. The proposed approach, called Curvature-Aware Random Search (CARS), uses first- and second-order finite difference approximations to compute a candidate $α_{+}$. We prove that for strongly convex objective functions, CARS converges linearly provided that the search direction is drawn from a distribution satisfying very mild conditions. We also present a Cubic Regularized variant of CARS, named CARS-CR, which converges in a rate of $\mathcal{O}(k^{-1})$ without the assumption of strong convexity. Numerical experiments show that CARS and CARS-CR match or exceed the state-of-the-arts on benchmark problem sets.

LGAug 23, 2021
Fast Robust Tensor Principal Component Analysis via Fiber CUR Decomposition

HanQin Cai, Zehan Chao, Longxiu Huang et al.

We study the problem of tensor robust principal component analysis (TRPCA), which aims to separate an underlying low-multilinear-rank tensor and a sparse outlier tensor from their sum. In this work, we propose a fast non-convex algorithm, coined Robust Tensor CUR (RTCUR), for large-scale TRPCA problems. RTCUR considers a framework of alternating projections and utilizes the recently developed tensor Fiber CUR decomposition to dramatically lower the computational complexity. The performance advantage of RTCUR is empirically verified against the state-of-the-arts on the synthetic datasets and is further demonstrated on the real-world application such as color video background subtraction.

NAMar 19, 2021
Mode-wise Tensor Decompositions: Multi-dimensional Generalizations of CUR Decompositions

HanQin Cai, Keaton Hamm, Longxiu Huang et al.

Low rank tensor approximation is a fundamental tool in modern machine learning and data science. In this paper, we study the characterization, perturbation analysis, and an efficient sampling strategy for two primary tensor CUR approximations, namely Chidori and Fiber CUR. We characterize exact tensor CUR decompositions for low multilinear rank tensors. We also present theoretical error bounds of the tensor CUR approximations when (adversarial or Gaussian) noise appears. Moreover, we show that low cost uniform sampling is sufficient for tensor CUR approximations if the tensor has an incoherent structure. Empirical performance evaluations, with both synthetic and real-world datasets, establish the speed advantage of the tensor CUR approximations over other state-of-the-art low multilinear rank tensor approximations.

OCFeb 21, 2021
A Zeroth-Order Block Coordinate Descent Algorithm for Huge-Scale Black-Box Optimization

HanQin Cai, Yuchen Lou, Daniel McKenzie et al.

We consider the zeroth-order optimization problem in the huge-scale setting, where the dimension of the problem is so large that performing even basic vector operations on the decision variables is infeasible. In this paper, we propose a novel algorithm, coined ZO-BCD, that exhibits favorable overall query complexity and has a much smaller per-iteration computational complexity. In addition, we discuss how the memory footprint of ZO-BCD can be reduced even further by the clever use of circulant measurement matrices. As an application of our new method, we propose the idea of crafting adversarial attacks on neural network based classifiers in a wavelet domain, which can result in problem dimensions of over 1.7 million. In particular, we show that crafting adversarial examples to audio classifiers in a wavelet domain can achieve the state-of-the-art attack success rate of 97.9%.

CVJan 5, 2021
Robust CUR Decomposition: Theory and Imaging Applications

HanQin Cai, Keaton Hamm, Longxiu Huang et al.

This paper considers the use of Robust PCA in a CUR decomposition framework and applications thereof. Our main algorithms produce a robust version of column-row factorizations of matrices $\mathbf{D}=\mathbf{L}+\mathbf{S}$ where $\mathbf{L}$ is low-rank and $\mathbf{S}$ contains sparse outliers. These methods yield interpretable factorizations at low computational cost, and provide new CUR decompositions that are robust to sparse outliers, in contrast to previous methods. We consider two key imaging applications of Robust PCA: video foreground-background separation and face modeling. This paper examines the qualitative behavior of our Robust CUR decompositions on the benchmark videos and face datasets, and find that our method works as well as standard Robust PCA while being significantly faster. Additionally, we consider hybrid randomized and deterministic sampling methods which produce a compact CUR decomposition of a given matrix, and apply this to video sequences to produce canonical frames thereof.

MLOct 14, 2020
Rapid Robust Principal Component Analysis: CUR Accelerated Inexact Low Rank Estimation

HanQin Cai, Keaton Hamm, Longxiu Huang et al.

Robust principal component analysis (RPCA) is a widely used tool for dimension reduction. In this work, we propose a novel non-convex algorithm, coined Iterated Robust CUR (IRCUR), for solving RPCA problems, which dramatically improves the computational efficiency in comparison with the existing algorithms. IRCUR achieves this acceleration by employing CUR decomposition when updating the low rank component, which allows us to obtain an accurate low rank approximation via only three small submatrices. Consequently, IRCUR is able to process only the small submatrices and avoid expensive computing on the full matrix through the entire algorithm. Numerical experiments establish the computational advantage of IRCUR over the state-of-art algorithms on both synthetic and real-world datasets.

OCOct 6, 2020
A One-bit, Comparison-Based Gradient Estimator

HanQin Cai, Daniel Mckenzie, Wotao Yin et al.

We study zeroth-order optimization for convex functions where we further assume that function evaluations are unavailable. Instead, one only has access to a $\textit{comparison oracle}$, which given two points $x$ and $y$ returns a single bit of information indicating which point has larger function value, $f(x)$ or $f(y)$. By treating the gradient as an unknown signal to be recovered, we show how one can use tools from one-bit compressed sensing to construct a robust and reliable estimator of the normalized gradient. We then propose an algorithm, coined SCOBO, that uses this estimator within a gradient descent scheme. We show that when $f(x)$ has some low dimensional structure that can be exploited, SCOBO outperforms the state-of-the-art in terms of query complexity. Our theoretical claims are verified by extensive numerical experiments.

OCMar 29, 2020
Zeroth-Order Regularized Optimization (ZORO): Approximately Sparse Gradients and Adaptive Sampling

HanQin Cai, Daniel Mckenzie, Wotao Yin et al.

We consider the problem of minimizing a high-dimensional objective function, which may include a regularization term, using (possibly noisy) evaluations of the function. Such optimization is also called derivative-free, zeroth-order, or black-box optimization. We propose a new $\textbf{Z}$eroth-$\textbf{O}$rder $\textbf{R}$egularized $\textbf{O}$ptimization method, dubbed ZORO. When the underlying gradient is approximately sparse at an iterate, ZORO needs very few objective function evaluations to obtain a new iterate that decreases the objective function. We achieve this with an adaptive, randomized gradient estimator, followed by an inexact proximal-gradient scheme. Under a novel approximately sparse gradient assumption and various different convex settings, we show the (theoretical and empirical) convergence rate of ZORO is only logarithmically dependent on the problem dimension. Numerical experiments show that ZORO outperforms the existing methods with similar assumptions, on both synthetic and real datasets.

ITOct 13, 2019
Accelerated Structured Alternating Projections for Robust Spectrally Sparse Signal Recovery

HanQin Cai, Jian-Feng Cai, Tianming Wang et al.

Consider a spectrally sparse signal $\boldsymbol{x}$ that consists of $r$ complex sinusoids with or without damping. We study the robust recovery problem for the spectrally sparse signal under the fully observed setting, which is about recovering $\boldsymbol{x}$ and a sparse corruption vector $\boldsymbol{s}$ from their sum $\boldsymbol{z}=\boldsymbol{x}+\boldsymbol{s}$. In this paper, we exploit the low-rank property of the Hankel matrix formed by $\boldsymbol{x}$, and formulate the problem as the robust recovery of a corrupted low-rank Hankel matrix. We develop a highly efficient non-convex algorithm, coined Accelerated Structured Alternating Projections (ASAP). The high computational efficiency and low space complexity of ASAP are achieved by fast computations involving structured matrices, and a subspace projection method for accelerated low-rank approximation. Theoretical recovery guarantee with a linear convergence rate has been established for ASAP, under some mild assumptions on $\boldsymbol{x}$ and $\boldsymbol{s}$. Empirical performance comparisons on both synthetic and real-world data confirm the advantages of ASAP, in terms of computational efficiency and robustness aspects.

ITNov 15, 2017
Accelerated Alternating Projections for Robust Principal Component Analysis

HanQin Cai, Jian-Feng Cai, Ke Wei

We study robust PCA for the fully observed setting, which is about separating a low rank matrix $\boldsymbol{L}$ and a sparse matrix $\boldsymbol{S}$ from their sum $\boldsymbol{D}=\boldsymbol{L}+\boldsymbol{S}$. In this paper, a new algorithm, dubbed accelerated alternating projections, is introduced for robust PCA which significantly improves the computational efficiency of the existing alternating projections proposed in [Netrapalli, Praneeth, et al., 2014] when updating the low rank factor. The acceleration is achieved by first projecting a matrix onto some low dimensional subspace before obtaining a new estimate of the low rank matrix via truncated SVD. Exact recovery guarantee has been established which shows linear convergence of the proposed algorithm. Empirical performance evaluations establish the advantage of our algorithm over other state-of-the-art algorithms for robust PCA.