The Statistics of Streaming Sparse Regression
This addresses the need for efficient sparse regression in streaming data scenarios, offering a method with statistical guarantees and computational efficiency, though it appears incremental as it builds on stochastic gradient descent and lasso concepts.
The paper tackles the problem of streaming sparse regression by proposing a sparse analogue to stochastic gradient descent that recovers the support set with high probability and achieves a quasi-optimal convergence rate of Op(k log(d)/T). It outperforms existing streaming algorithms in experiments on real and simulated data.
We present a sparse analogue to stochastic gradient descent that is guaranteed to perform well under similar conditions to the lasso. In the linear regression setup with irrepresentable noise features, our algorithm recovers the support set of the optimal parameter vector with high probability, and achieves a statistically quasi-optimal rate of convergence of Op(k log(d)/T), where k is the sparsity of the solution, d is the number of features, and T is the number of training examples. Meanwhile, our algorithm does not require any more computational resources than stochastic gradient descent. In our experiments, we find that our method substantially out-performs existing streaming algorithms on both real and simulated data.