Feature Space Sketching for Logistic Regression
This work provides theoretical foundations for efficient logistic regression in data analysis, though it appears incremental because it builds directly on prior coreset and sketching research.
The paper addresses efficient coreset construction, feature selection, and dimensionality reduction for logistic regression by deriving new theoretical bounds for sketching methods. The results include bounds on the complexity of coreset construction and forward error bounds for feature selection and dimensionality reduction; the bounds are tight up to constant factors, and the forward error bounds extend to Generalized Linear Models.
We present novel bounds for coreset construction, feature selection, and dimensionality reduction for logistic regression. All three approaches can be thought of as sketching the logistic regression inputs. On the coreset construction front, we resolve open problems from prior work and present novel bounds for the complexity of coreset construction methods. On the feature selection and dimensionality reduction front, we initiate the study of forward error bounds for logistic regression. Our bounds are tight up to constant factors and our forward error bounds can be extended to Generalized Linear Models.
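To make the phrase "sketching the logistic regression inputs" concrete, the following is a minimal sketch of the three kinds of sketch applied to a data matrix X with n rows (examples) and d columns (features). It is illustrative only and is not the construction analyzed in the paper: uniform row sampling, column selection by Euclidean norm, and a Gaussian random projection are stand-in choices, and the function names (logistic_loss, uniform_coreset, select_features, random_projection) are hypothetical. Labels are assumed to lie in {-1, +1}.

```python
import numpy as np


def logistic_loss(X, y, beta, weights=None):
    """Sum of (optionally weighted) losses log(1 + exp(-y_i * x_i^T beta))."""
    margins = y * (X @ beta)
    # logaddexp(0, -m) computes log(1 + exp(-m)) without overflow.
    losses = np.logaddexp(0.0, -margins)
    if weights is None:
        return float(losses.sum())
    return float(weights @ losses)


def uniform_coreset(X, y, m, rng):
    """Row sketch: keep m of the n examples, each reweighted by n / m.

    Uniform sampling is an assumption made for illustration; it is not
    claimed to be the sampling scheme studied in the paper.
    """
    n = X.shape[0]
    idx = rng.choice(n, size=m, replace=False)
    weights = np.full(m, n / m)
    return X[idx], y[idx], weights


def select_features(X, k):
    """Column sketch by selection: keep the k columns of largest norm.

    The norm-based rule is an illustrative heuristic only.
    """
    norms = np.linalg.norm(X, axis=0)
    cols = np.sort(np.argsort(norms)[-k:])
    return X[:, cols], cols


def random_projection(X, k, rng):
    """Column sketch by projection: map d features to k via a Gaussian matrix."""
    d = X.shape[1]
    G = rng.standard_normal((d, k)) / np.sqrt(k)
    return X @ G, G
```

A coreset shrinks the number of rows and is judged by how well the weighted loss on the sample tracks the full loss, while feature selection and dimensionality reduction shrink the number of columns. For example, with rng = np.random.default_rng(0), calling uniform_coreset(X, y, 50, rng) on a 200 x 10 matrix returns a 50 x 10 matrix with weights summing to 200.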