A termination criterion for stochastic gradient descent for binary classification
This work gives machine learning practitioners a practical stopping rule for SGD in binary classification. The contribution is incremental: it builds on existing SGD methods and relies on specific assumptions about the data.
The authors address the problem of deciding when to stop stochastic gradient descent in binary classification. They propose a simple termination test for constant step-size SGD on the logistic and hinge losses. They show that the test terminates in a finite number of iterations and that, under certain noise conditions, the classifier at termination nearly minimizes the probability of misclassification.
We propose a new, simple, and computationally inexpensive termination test for constant step-size stochastic gradient descent (SGD) applied to binary classification with the logistic and hinge losses and homogeneous linear predictors. Our theoretical results support the effectiveness of our stopping criterion when the data is Gaussian distributed. The presence of noise in this model allows for the possibility of non-separable data. We show that our test terminates in a finite number of iterations and that, when the noise in the data is not too large, the expected classifier at termination nearly minimizes the probability of misclassification. Finally, numerical experiments on both real and synthetic data sets indicate that our termination test exhibits a good degree of predictability in accuracy and running time.
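The abstract does not state the termination test itself, so the following is only a minimal sketch of the setting it describes: constant step-size SGD on the logistic loss with a homogeneous linear predictor and Gaussian-distributed data. The stopping rule used here (stop when a freshly drawn sample is classified with margin at least `margin`) is a hypothetical stand-in chosen for illustration and should not be read as the authors' criterion. The data model (labels of ±1 with equal probability, features `y * mu` plus isotropic Gaussian noise) and all names and default values are likewise assumptions.

```python
import math
import random


def sample_point(rng, mu, sigma):
    """Draw (x, y): y is +1 or -1 with equal probability, x = y * mu + sigma * noise."""
    y = rng.choice((-1.0, 1.0))
    x = [y * m + sigma * rng.gauss(0.0, 1.0) for m in mu]
    return x, y


def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(ai * bi for ai, bi in zip(a, b))


def logistic_weight(m):
    """Numerically stable evaluation of 1 / (1 + exp(m))."""
    if m >= 0:
        e = math.exp(-m)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(m))


def sgd_with_margin_stop(mu, sigma, step=0.1, margin=1.0,
                         max_iters=100_000, seed=0):
    """Constant step-size SGD on the logistic loss log(1 + exp(-y * <x, theta>)).

    Returns (theta, iterations, stopped), where `stopped` is True if the
    hypothetical margin test fired before `max_iters` was reached.
    """
    rng = random.Random(seed)
    theta = [0.0] * len(mu)  # homogeneous linear predictor, no bias term
    for k in range(1, max_iters + 1):
        # Hypothetical stopping test (an assumption, not the paper's test):
        # stop once a fresh sample is classified with margin >= `margin`.
        x_test, y_test = sample_point(rng, mu, sigma)
        if y_test * dot(x_test, theta) >= margin:
            return theta, k, True

        # One SGD step. The gradient of the logistic loss with respect to
        # theta is -y * x / (1 + exp(y * <x, theta>)).
        x, y = sample_point(rng, mu, sigma)
        w = logistic_weight(y * dot(x, theta))
        theta = [t + step * w * y * xi for t, xi in zip(theta, x)]
    return theta, max_iters, False
```

A call such as `sgd_with_margin_stop([1.0, 0.0], 0.5)` runs the loop on two-dimensional data with class means at ±(1, 0). Drawing a fresh sample for the test keeps the check cheap, at one inner product per iteration, which is in the spirit of the "computationally inexpensive" test the abstract mentions. Swapping in the hinge loss would only change the gradient step.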