Joel A. Tropp

h-index56

18papers

10,748citations

Novelty53%

AI Score51

Ranked #18,420 of 194,257 authors (top 9%)#29 in NA (top 1%)

18 Papers

24.1NAApr 17, 2008

CoSaMP: Iterative signal recovery from incomplete and inaccurate samples

D. Needell, J. A. Tropp

Compressive sampling offers a new paradigm for acquiring signals that are compressible with respect to an orthonormal basis. The major algorithmic challenge in compressive sampling is to approximate a compressible signal from noisy samples. This paper describes a new iterative recovery algorithm called CoSaMP that delivers the same guarantees as the best optimization-based approaches. Moreover, this algorithm offers rigorous bounds on computational cost and storage. It is likely to be extremely efficient for practical problems because it requires only matrix-vector multiplies with the sampling matrix. For many cases of interest, the running time is just O(N*log^2(N)), where N is the length of the signal.

12.2NAJul 17, 2011

Improved analysis of the subsampled randomized Hadamard transform

Joel A. Tropp

This paper presents an improved analysis of a structured dimension-reduction map called the subsampled randomized Hadamard transform. This argument demonstrates that the map preserves the Euclidean geometry of an entire subspace of vectors. The new proof is much simpler than previous approaches, and it offers---for the first time---optimal constants in the estimate on the number of dimensions required for the embedding.

7.3NAJun 26, 2008

Column Subset Selection, Matrix Factorization, and Eigenvalue Optimization

Joel A. Tropp

Given a fixed matrix, the problem of column subset selection requests a column submatrix that has favorable spectral properties. Most research from the algorithms and numerical linear algebra communities focuses on a variant called rank-revealing {\sf QR}, which seeks a well-conditioned collection of columns that spans the (numerical) range of the matrix. The functional analysis literature contains another strand of work on column selection whose algorithmic implications have not been explored. In particular, a celebrated result of Bourgain and Tzafriri demonstrates that each matrix with normalized columns contains a large column submatrix that is exceptionally well conditioned. Unfortunately, standard proofs of this result cannot be regarded as algorithmic. This paper presents a randomized, polynomial-time algorithm that produces the submatrix promised by Bourgain and Tzafriri. The method involves random sampling of columns, followed by a matrix factorization that exposes the well-conditioned subset of columns. This factorization, which is due to Grothendieck, is regarded as a central tool in modern functional analysis. The primary novelty in this work is an algorithm, based on eigenvalue minimization, for constructing the Grothendieck factorization. These ideas also result in a novel approximation algorithm for the $(\infty, 1)$ norm of a matrix, which is generally {\sf NP}-hard to compute exactly. As an added bonus, this work reveals a surprising connection between matrix factorization and the famous {\sc maxcut} semidefinite program.

1.2NAFeb 22, 2019

Streaming Low-Rank Matrix Approximation with an Application to Scientific Simulation

Joel A. Tropp, Alp Yurtsever, Madeleine Udell et al.

This paper argues that randomized linear sketching is a natural tool for on-the-fly compression of data matrices that arise from large-scale scientific simulations and data collection. The technical contribution consists in a new algorithm for constructing an accurate low-rank approximation of a matrix from streaming data. This method is accompanied by an a priori analysis that allows the user to set algorithm parameters with confidence and an a posteriori error estimator that allows the user to validate the quality of the reconstructed matrix. In comparison to previous techniques, the new method achieves smaller relative approximation errors and is less sensitive to parameter choices. As concrete applications, the paper outlines how the algorithm can be used to compress a Navier--Stokes simulation and a sea surface temperature dataset.

1.2NADec 11, 2008

Greedy Signal Recovery Review

D. Needell, J. A. Tropp, R. Vershynin

The two major approaches to sparse recovery are L1-minimization and greedy methods. Recently, Needell and Vershynin developed Regularized Orthogonal Matching Pursuit (ROMP) that has bridged the gap between these two approaches. ROMP is the first stable greedy algorithm providing uniform guarantees. Even more recently, Needell and Tropp developed the stable greedy algorithm Compressive Sampling Matching Pursuit (CoSaMP). CoSaMP provides uniform guarantees and improves upon the stability bounds and RIC requirements of ROMP. CoSaMP offers rigorous bounds on computational cost and storage. In many cases, the running time is just O(NlogN), where N is the ambient dimension of the signal. This review summarizes these major advances.

8.5PRMar 14

Comparison theorems for the extreme eigenvalues of a random symmetric matrix

Joel A. Tropp

This paper establishes a comparison theorem for the maximum eigenvalue of a sum of independent random symmetric matrices. The theorem states that the maximum eigenvalue of the matrix sum is dominated by the maximum eigenvalue of a Gaussian random matrix that inherits its statistics from the sum, and it strengthens previous results of this type. Corollaries address the minimum eigenvalue and the spectral norm. The comparison methodology is powerful because of the vast arsenal of tools for treating Gaussian random matrices. As applications, the paper improves on existing eigenvalue bounds for random matrices arising in spectral graph theory, quantum information theory, high-dimensional statistics, and numerical linear algebra. In particular, these techniques deliver the first complete proof that a sparse random dimension reduction map has the injectivity properties conjectured by Nelson & Nguyen in 2013.

6.6PRApr 29

Applied Random Matrix Theory

Joel A. Tropp

Random matrices now play a role in many parts of computational mathematics. To advance these applications, it is desirable to have tools that are flexible, easy to use, and powerful. Over the last 25 years, researchers have developed a remarkable family of results, called matrix concentration inequalities, that meet the criteria. This paper offers an invitation to the field of matrix concentration and its multifarious applications.

11.7DSSep 20, 2021Code

Learning to Forecast Dynamical Systems from Streaming Data

Dimitris Giannakis, Amelia Henriksen, Joel A. Tropp et al.

Kernel analog forecasting (KAF) is a powerful methodology for data-driven, non-parametric forecasting of dynamically generated time series data. This approach has a rigorous foundation in Koopman operator theory and it produces good forecasts in practice, but it suffers from the heavy computational costs common to kernel methods. This paper proposes a streaming algorithm for KAF that only requires a single pass over the training data. This algorithm dramatically reduces the costs of training and prediction without sacrificing forecasting skill. Computational experiments demonstrate that the streaming KAF method can successfully forecast several classes of dynamical systems (periodic, quasi-periodic, and chaotic) in both data-scarce and data-rich regimes. The overall methodology may have wider interest as a new template for streaming kernel regression.

13.0NAApr 30, 2021

Tensor Random Projection for Low Memory Dimension Reduction

Yiming Sun, Yang Guo, Joel A. Tropp et al.

Random projections reduce the dimension of a set of vectors while preserving structural information, such as distances between vectors in the set. This paper proposes a novel use of row-product random matrices in random projection, where we call it Tensor Random Projection (TRP). It requires substantially less memory than existing dimension reduction maps. The TRP map is formed as the Khatri-Rao product of several smaller random projections, and is compatible with any base random projection including sparse maps, which enable dimension reduction with very low query cost and no floating point operations. We also develop a reduced variance extension. We provide a theoretical analysis of the bias and variance of the TRP, and a non-asymptotic error analysis for a TRP composed of two smaller maps. Experiments on both synthetic and MNIST data show that our method performs as well as conventional methods with substantially less storage.

17.8NAApr 24, 2019Code

Low-Rank Tucker Approximation of a Tensor From Streaming Data

Yiming Sun, Yang Guo, Charlene Luo et al.

This paper describes a new algorithm for computing a low-Tucker-rank approximation of a tensor. The method applies a randomized linear map to the tensor to obtain a sketch that captures the important directions within each mode, as well as the interactions among the modes. The sketch can be extracted from streaming or distributed data or with a single pass over the tensor, and it uses storage proportional to the degrees of freedom in the output Tucker approximation. The algorithm does not require a second pass over the tensor, although it can exploit another view to compute a superior approximation. The paper provides a rigorous theoretical guarantee on the approximation error. Extensive numerical experiments show that that the algorithm produces useful results that improve on the state of the art for streaming Tucker decomposition.

14.1OCFeb 9, 2019

An Optimal-Storage Approach to Semidefinite Programming using Approximate Complementarity

Lijun Ding, Alp Yurtsever, Volkan Cevher et al.

This paper develops a new storage-optimal algorithm that provably solves generic semidefinite programs (SDPs) in standard form. This method is particularly effective for weakly constrained SDPs. The key idea is to formulate an approximate complementarity principle: Given an approximate solution to the dual SDP, the primal SDP has an approximate solution whose range is contained in the eigenspace with small eigenvalues of the dual slack matrix. For weakly constrained SDPs, this eigenspace has very low dimension, so this observation significantly reduces the search space for the primal solution. This result suggests an algorithmic strategy that can be implemented with minimal storage: (1) Solve the dual SDP approximately; (2) compress the primal SDP to the eigenspace with small eigenvalues of the dual slack matrix; (3) solve the compressed primal SDP. The paper also provides numerical experiments showing that this approach is successful for a range of interesting large-scale SDPs.

17.3NAJun 18, 2017

Fixed-Rank Approximation of a Positive-Semidefinite Matrix from Streaming Data

Joel A. Tropp, Alp Yurtsever, Madeleine Udell et al.

Several important applications, such as streaming PCA and semidefinite programming, involve a large-scale positive-semidefinite (psd) matrix that is presented as a sequence of linear updates. Because of storage limitations, it may only be possible to retain a sketch of the psd matrix. This paper develops a new algorithm for fixed-rank psd approximation from a sketch. The approach combines the Nystrom approximation with a novel mechanism for rank truncation. Theoretical analysis establishes that the proposed method can achieve any prescribed relative error in the Schatten 1-norm and that it exploits the spectral decay of the input matrix. Computer experiments show that the proposed method dominates alternative techniques for fixed-rank psd matrix approximation across a wide range of examples.

25.3OCFeb 22, 2017

Sketchy Decisions: Convex Low-Rank Matrix Optimization with Optimal Storage

Alp Yurtsever, Madeleine Udell, Joel A. Tropp et al.

This paper concerns a fundamental class of convex matrix optimization problems. It presents the first algorithm that uses optimal storage and provably computes a low-rank approximation of a solution. In particular, when all solutions have low rank, the algorithm converges to a solution. This algorithm, SketchyCGM, modifies a standard convex optimization scheme, the conditional gradient method, to store only a small randomized sketch of the matrix variable. After the optimization terminates, the algorithm extracts a low-rank approximation of the solution from the sketch. In contrast to nonconvex heuristics, the guarantees for SketchyCGM do not rely on statistical models for the problem data. Numerical work demonstrates the benefits of SketchyCGM over heuristics.

20.0PRNov 30, 2015

Universality laws for randomized dimension reduction, with applications

Samet Oymak, Joel A. Tropp

Dimension reduction is the process of embedding high-dimensional data into a lower dimensional space to facilitate its analysis. In the Euclidean setting, one fundamental technique for dimension reduction is to apply a random linear map to the data. This dimension reduction procedure succeeds when it preserves certain geometric features of the set. The question is how large the embedding dimension must be to ensure that randomized dimension reduction succeeds with high probability. This paper studies a natural family of randomized dimension reduction maps and a large class of data sets. It proves that there is a phase transition in the success probability of the dimension reduction map as the embedding dimension increases. For a given data set, the location of the phase transition is the same for all maps in this family. Furthermore, each map has the same stability properties, as quantified through the restricted minimum singular value. These results can be viewed as new universality laws in high-dimensional stochastic geometry. Universality laws for randomized dimension reduction have many applications in applied mathematics, signal processing, and statistics. They yield design principles for numerical linear algebra algorithms, for compressed sensing measurement ensembles, and for random linear codes. Furthermore, these results have implications for the performance of statistical estimation methods under a large class of random experimental designs.

46.3PRJan 7, 2015

An Introduction to Matrix Concentration Inequalities

Joel A. Tropp

In recent years, random matrices have come to play a major role in computational mathematics, but most of the classical areas of random matrix theory remain the province of experts. Over the last decade, with the advent of matrix concentration inequalities, research has advanced to the point where we can conquer many (formerly) challenging problems with a page or two of arithmetic. The aim of this monograph is to describe the most successful methods from this area along with some interesting examples that these techniques can illuminate.

33.0OCJun 6, 2012

Factoring nonnegative matrices with linear programs

Victor Bittorf, Benjamin Recht, Christopher Re et al.

This paper describes a new approach, based on linear programming, for computing nonnegative matrix factorizations (NMFs). The key idea is a data-driven model for the factorization where the most salient features in the data are used to express the remaining features. More precisely, given a data matrix X, the algorithm identifies a matrix C such that X approximately equals CX and some linear constraints. The constraints are chosen to ensure that the matrix C selects features; these features can then be used to find a low-rank NMF of X. A theoretical analysis demonstrates that this approach has guarantees similar to those of the recent NMF algorithm of Arora et al. (2012). In contrast with this earlier work, the proposed method extends to more general noise models and leads to efficient, scalable algorithms. Experiments with synthetic and real datasets provide evidence that the new approach is also superior in practice. An optimized C++ implementation can factor a multigigabyte matrix in a matter of minutes.

27.6NADec 14, 2010

Finding structure with randomness: Probabilistic algorithms for constructing approximate matrix decompositions

Nathan Halko, Per-Gunnar Martinsson, Joel A. Tropp

Low-rank matrix approximations, such as the truncated singular value decomposition and the rank-revealing QR decomposition, play a central role in data analysis and scientific computing. This work surveys and extends recent research which demonstrates that randomization offers a powerful tool for performing low-rank matrix approximation. These techniques exploit modern computational architectures more fully than classical methods and open the possibility of dealing with truly massive data sets. This paper presents a modular framework for constructing randomized algorithms that compute partial matrix decompositions. These methods use random sampling to identify a subspace that captures most of the action of a matrix. The input matrix is then compressed---either explicitly or implicitly---to this subspace, and the reduced matrix is manipulated deterministically to obtain the desired low-rank factorization. In many cases, this approach beats its classical competitors in terms of accuracy, speed, and robustness. These claims are supported by extensive numerical experiments and a detailed error analysis.

2.3NANov 20, 2009

Error Bounds for Random Matrix Approximation Schemes

Alex Gittens, Joel A. Tropp

Randomized matrix sparsification has proven to be a fruitful technique for producing faster algorithms in applications ranging from graph partitioning to semidefinite programming. In the decade or so of research into this technique, the focus has been--with few exceptions--on ensuring the quality of approximation in the spectral and Frobenius norms. For certain graph algorithms, however, the $(\infty,1)$ norm may be a more natural measure of performance. This paper addresses the problem of approximating a real matrix A by a sparse random matrix X with respect to several norms. It provides the first results on approximation error in the $(\infty, 1)$ and $(\infty, 2)$ norms, and it uses a result of Latala to study approximation error in the spectral norm. These bounds hold for random sparsification schemes which ensure that the entries of X are independent and average to the corresponding entries of A. Optimality of the $(\infty, 1)$ and $(\infty,2)$ error estimates is established. Concentration results for the three norms hold when the entries of X are uniformly bounded. The spectral error bound is used to predict the performance of several sparsification and quantization schemes that have appeared in the literature; the results are competitive with the performance guarantees given by earlier scheme-specific analyses.