On the Rates of Convergence from Surrogate Risk Minimizers to the Bayes Optimal Classifier
This work provides theoretical guidance for selecting and modifying loss functions in machine learning, though its contribution is incremental because it builds on existing convergence analysis.
The paper tackles the problem of comparing how quickly different surrogate loss functions converge to the optimal classifier, introducing 'consistency intensity' to characterize this and showing that hinge loss converges faster than logistic and exponential loss.
We study the rates of convergence from empirical surrogate risk minimizers to the Bayes optimal classifier. Specifically, we introduce the notion of \emph{consistency intensity} to characterize a surrogate loss function and exploit this notion to obtain the rate of convergence from an empirical surrogate risk minimizer to the Bayes optimal classifier, enabling fair comparisons of the excess risks of different surrogate risk minimizers. The main result of the paper has practical implications, including (1) showing that hinge loss is superior to logistic and exponential loss in the sense that its empirical minimizer converges faster to the Bayes optimal classifier and (2) guiding the modification of surrogate loss functions to accelerate convergence to the Bayes optimal classifier.
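For concreteness, the three surrogate losses compared above can be written as functions of the margin $m = y f(x)$ with labels $y \in \{-1, +1\}$. The sketch below assumes their standard textbook forms (natural logarithm for the logistic loss) and shows the empirical surrogate risk that an empirical minimizer would minimize. It does not implement consistency intensity, whose definition is not stated in this abstract, and all function names are illustrative rather than taken from the paper.

```python
import math


def hinge_loss(margin):
    """Hinge loss: max(0, 1 - m)."""
    return max(0.0, 1.0 - margin)


def logistic_loss(margin):
    """Logistic loss: log(1 + exp(-m)), evaluated in a numerically stable way."""
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    # For negative margins, rewrite to avoid overflow in exp(-m).
    return -margin + math.log1p(math.exp(margin))


def exponential_loss(margin):
    """Exponential loss: exp(-m)."""
    return math.exp(-margin)


def empirical_surrogate_risk(loss, scores, labels):
    """Average surrogate loss of real-valued scores f(x_i) against labels y_i in {-1, +1}."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    return sum(loss(y * s) for s, y in zip(scores, labels)) / len(scores)


if __name__ == "__main__":
    # Hypothetical scores and labels, used only to exercise the functions.
    scores = [2.0, -0.5, 0.3]
    labels = [1, -1, -1]
    for loss in (hinge_loss, logistic_loss, exponential_loss):
        print(loss.__name__, empirical_surrogate_risk(loss, scores, labels))
```

All three losses upper-bound or track the 0-1 loss through the margin alone, which is why they can be compared on a common footing; the paper's claim concerns how fast the minimizers of these empirical risks approach the Bayes optimal classifier, not the loss values themselves.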