ST, IT, LG, ML · Jul 9, 2023

On the sample complexity of parameter estimation in logistic regression with normal design

arXiv:2307.04191v4 · 11 citations · h-index: 24
Originality: Synthesis-oriented
AI Analysis

This work addresses the non-asymptotic sample complexity for parameter estimation in logistic regression, an incremental contribution to understanding estimation limits in noisy binary classification.

The paper studies the sample complexity of parameter estimation in logistic regression with normal design, showing that the required number of samples depends on dimension and inverse temperature, with the curve exhibiting two change-points that separate low, moderate, and high temperature regimes.

The logistic regression model is one of the most popular data generation models in noisy binary classification problems. In this work, we study the sample complexity of estimating the parameters of the logistic regression model up to a given $\ell_2$ error, in terms of the dimension and the inverse temperature, with standard normal covariates. The inverse temperature controls the signal-to-noise ratio of the data generation process. While both generalization bounds and asymptotic performance of the maximum-likelihood estimator for logistic regression are well-studied, the non-asymptotic sample complexity that shows the dependence on error and the inverse temperature for parameter estimation is absent from previous analyses. We show that the sample complexity curve has two change-points in terms of the inverse temperature, clearly separating the low, moderate, and high temperature regimes.
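
The setting in the abstract can be illustrated with a small simulation. The sketch below is not taken from the paper: it assumes labels in {-1, +1}, a unit-norm parameter vector `theta`, and the parameterization P(y = 1 | x) = sigmoid(beta * <x, theta>), where `beta` plays the role of the inverse temperature. The estimator (plain gradient descent on the logistic loss, followed by normalization) and all function names are hypothetical choices made for illustration, not the authors' method.

```python
import numpy as np


def sample_logistic(n, theta, beta, rng):
    """Draw n samples with x ~ N(0, I_d) and y in {-1, +1}.

    Assumed model: P(y = 1 | x) = sigmoid(beta * <x, theta>).
    """
    d = theta.shape[0]
    X = rng.standard_normal((n, d))
    # sigmoid(z) written via tanh for numerical stability
    p = 0.5 * (1.0 + np.tanh(0.5 * beta * (X @ theta)))
    y = np.where(rng.random(n) < p, 1.0, -1.0)
    return X, y


def fit_direction(X, y, steps=500, lr=0.5):
    """Gradient descent on the mean logistic loss; returns a unit vector."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(steps):
        margins = y * (X @ w)
        # sigmoid(-margin), again via tanh to avoid overflow
        s = 0.5 * (1.0 - np.tanh(0.5 * margins))
        # gradient of mean log(1 + exp(-margin)) with respect to w
        grad = -(X * (y * s)[:, None]).mean(axis=0)
        w -= lr * grad
    return w / np.linalg.norm(w)


def l2_error(n, d, beta, seed=0):
    """Simulate one dataset and return ||theta_hat - theta||_2."""
    rng = np.random.default_rng(seed)
    theta = np.zeros(d)
    theta[0] = 1.0  # hypothetical ground truth on the unit sphere
    X, y = sample_logistic(n, theta, beta, rng)
    theta_hat = fit_direction(X, y)
    return float(np.linalg.norm(theta_hat - theta))


if __name__ == "__main__":
    # Error as a function of sample size for a few inverse temperatures.
    for beta in (0.5, 2.0, 8.0):
        errs = [l2_error(n, d=5, beta=beta) for n in (50, 500, 4000)]
        print(beta, [round(e, 3) for e in errs])
```

Running the script prints the estimation error for three sample sizes at each value of `beta`. It is only a way to explore how the error depends on the sample size, dimension, and inverse temperature; it does not reproduce the paper's bounds or its change-points.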
