NA DS LG

Mar 7, 2018

Sketching for Principal Component Regression

arXiv:1803.02661v26.66 citations

Originality: Incremental advance

AI Analysis

This work addresses computational bottlenecks in PCR for large-scale data analysis, offering incremental improvements in efficiency.

The paper tackles the high computational cost of principal component regression (PCR) on large-scale data by proposing efficient algorithms that provide high-quality approximations with rigorous risk bounds, run in input-sparsity time, and show excellent empirical performance.

Principal component regression (PCR) is a useful method for regularizing linear regression. Although conceptually simple, straightforward implementations of PCR have high computational costs and so are inappropriate when learning with large-scale data. In this paper, we propose efficient algorithms for computing approximate PCR solutions that are, on the one hand, high-quality approximations to the true PCR solutions (when viewed as minimizers of a constrained optimization problem) and, on the other hand, admit rigorous risk bounds (when viewed as statistical estimators). In particular, we propose an input-sparsity-time algorithm for approximate PCR. We also consider computing an approximate PCR solution in the streaming model, and kernel PCR. Empirical results demonstrate the excellent performance of our proposed methods.
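To make the setting concrete, here is a minimal NumPy sketch of exact PCR next to one generic sketch-based approximation. It is an illustration under stated assumptions, not the algorithm from the paper: the function names (`pcr`, `countsketch`, `sketched_pcr`) and the sketch size `s` are hypothetical, and the approximation simply estimates the top-k principal directions from a CountSketch of the rows before solving the reduced regression.

```python
import numpy as np

def pcr(A, b, k):
    """Exact PCR: regress b on the projection of A onto its top-k principal directions."""
    _, _, Vt = np.linalg.svd(A, full_matrices=False)
    Vk = Vt[:k].T                                    # d x k, top-k right singular vectors
    y, *_ = np.linalg.lstsq(A @ Vk, b, rcond=None)   # k-dimensional least squares
    return Vk @ y                                    # map coefficients back to d dimensions

def countsketch(A, s, rng):
    """Apply an s x n CountSketch to the rows of A in time proportional to A's size."""
    n = A.shape[0]
    rows = rng.integers(0, s, size=n)                # hash each row to one of s buckets
    signs = rng.choice([-1.0, 1.0], size=n)          # random sign per row
    SA = np.zeros((s, A.shape[1]))
    np.add.at(SA, rows, signs[:, None] * A)          # accumulate signed rows per bucket
    return SA

def sketched_pcr(A, b, k, s, rng):
    """Approximate PCR (illustrative assumption, not the paper's method):
    estimate the top-k directions from the s x d sketch instead of from A."""
    SA = countsketch(A, s, rng)
    _, _, Vt = np.linalg.svd(SA, full_matrices=False)  # SVD of the small sketch only
    Vk = Vt[:k].T
    y, *_ = np.linalg.lstsq(A @ Vk, b, rcond=None)
    return Vk @ y
```

Calling `sketched_pcr(A, b, k, s, np.random.default_rng(0))` with `s` much smaller than the number of rows replaces the SVD of the full matrix with an SVD of an `s`-row sketch. How closely the result tracks `pcr(A, b, k)` depends on the spectral gap of `A` and on `s`.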
