LG · Apr 3

Efficient Logistic Regression with Mixture of Sigmoids

Federico Di Gennaro, Saptarshi Chakraborty, Nikita Zhivotovskiy

arXiv:2604.02920 · 7.7 · h-index: 3

Predicted impact: top 81% in LG · last 90 days · Originality: Incremental advance

AI Analysis

This work addresses the computational bottleneck of the Exponential Weights algorithm for online logistic regression, offering an incremental improvement in efficiency and geometric insight into the linearly separable regime.

This paper tackles the computational inefficiency of the Exponential Weights (EW) algorithm for online logistic regression. It achieves a total worst-case computational complexity of $O(B^3 n^5)$, a substantial improvement on the prior $O(B^{18}n^{37})$, while maintaining the same near-optimal regret bound. It also shows that under linear separability the algorithm's predictor converges, as $B\to\infty$, to a solid-angle vote over separating directions, with non-asymptotic regret bounds that become independent of $B$ and grow only logarithmically with the inverse margin once $B$ exceeds a margin-dependent threshold.
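One possible reading of the "solid-angle vote" is the fraction of separating directions that label a new point positive. The sketch below illustrates that reading only and is not the paper's algorithm. The function name `solid_angle_vote`, the rejection-sampling approach, and the sample count are all assumptions made for illustration.

```python
import numpy as np

def solid_angle_vote(X_past, y_past, x_new, n_samples=200_000, seed=0):
    """Estimate the fraction of separating directions that label x_new as +1.

    Hypothetical illustration: X_past has shape (t, d), y_past holds labels
    in {-1, +1}, and x_new has shape (d,).
    """
    rng = np.random.default_rng(seed)
    d = x_new.shape[0]
    # Standard Gaussian draws are rotationally symmetric, so their
    # directions are uniformly distributed on the unit sphere.
    theta = rng.standard_normal((n_samples, d))
    # Keep only directions that classify every past example correctly
    # (the separating directions).
    in_cone = np.all((theta @ X_past.T) * y_past > 0, axis=1)
    if not in_cone.any():
        raise ValueError("no sampled direction separates the data")
    # Each surviving direction votes with the sign of its inner product.
    return float(np.mean(theta[in_cone] @ x_new > 0))
```

Rejection sampling is the simplest choice here, but it wastes most draws when the set of separating directions is narrow (small margin or many constraints), so it is suitable only for small illustrative problems.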

This paper studies the Exponential Weights (EW) algorithm with an isotropic Gaussian prior for online logistic regression. We show that the near-optimal worst-case regret bound $O(d\log(Bn))$ for EW, established by Kakade and Ng (2005) against the best linear predictor of norm at most $B$, can be achieved with total worst-case computational complexity $O(B^3 n^5)$. This substantially improves on the $O(B^{18}n^{37})$ complexity of prior work achieving the same guarantee (Foster et al., 2018). Beyond efficiency, we analyze the large-$B$ regime under linear separability: after rescaling by $B$, the EW posterior converges as $B\to\infty$ to a standard Gaussian truncated to the version cone. Accordingly, the predictor converges to a solid-angle vote over separating directions and, on every fixed-margin slice of this cone, the mode of the corresponding truncated Gaussian is aligned with the hard-margin SVM direction. Using this geometry, we derive non-asymptotic regret bounds showing that once $B$ exceeds a margin-dependent threshold, the regret becomes independent of $B$ and grows only logarithmically with the inverse margin. Overall, our results show that EW can be both computationally tractable and geometrically adaptive in online classification.
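To make the predictor in the abstract concrete, here is a minimal sketch of Exponential Weights for logistic loss as a posterior-weighted mixture of sigmoids, estimated by naive Monte Carlo. It is not the efficient procedure the paper proposes. The prior scale $N(0, B^2 I)$, the function name `ew_predict`, and the sample count are assumptions, since the abstract specifies only that the prior is an isotropic Gaussian.

```python
import numpy as np

def sigmoid(z):
    # Numerically stable logistic function via tanh.
    return 0.5 * (1.0 + np.tanh(0.5 * z))

def ew_predict(X_past, y_past, x_new, B=5.0, n_samples=20_000, seed=0):
    """Monte Carlo estimate of the EW prediction P(y = +1 | x_new).

    Hypothetical illustration: X_past has shape (t, d), y_past holds labels
    in {-1, +1}, and x_new has shape (d,).
    """
    rng = np.random.default_rng(seed)
    d = x_new.shape[0]
    # Assumed prior: isotropic Gaussian N(0, B^2 I).
    theta = rng.normal(scale=B, size=(n_samples, d))
    # EW weight of each sample: exp(-cumulative logistic loss so far).
    margins = (theta @ X_past.T) * y_past            # shape (n_samples, t)
    log_w = -np.logaddexp(0.0, -margins).sum(axis=1)
    w = np.exp(log_w - log_w.max())                  # subtract max for stability
    w /= w.sum()
    # Prediction: weighted mixture of the samples' sigmoid outputs.
    return float(w @ sigmoid(theta @ x_new))
```

With no past data the weights are uniform and the estimate is close to 0.5. Sampling from the prior degrades quickly as the posterior concentrates, which is one reason a dedicated algorithm with explicit complexity guarantees is of interest.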
