On Structured Prediction Theory with Calibrated Convex Surrogate Losses
This work provides theoretical insights for machine learning researchers working on structured prediction, offering a framework to design efficient convex surrogates with proven guarantees, though it is incremental in building on prior theory.
The paper tackles the problem of structured prediction by constructing convex surrogate losses with consistency guarantees and analyzing the effect of the exponential number of classes on learning and optimization. It shows that some task losses make learning harder, with the classical 0-1 loss being ill-suited for structured prediction.
We provide novel theoretical insights on structured prediction in the context of efficient convex surrogate loss minimization with consistency guarantees. For any task loss, we construct a convex surrogate that can be optimized via stochastic gradient descent and we prove tight bounds on the so-called "calibration function" relating the excess surrogate risk to the actual risk. In contrast to prior related work, we carefully monitor the effect of the exponential number of classes in the learning guarantees as well as on the optimization complexity. As an interesting consequence, we formalize the intuition that some task losses make learning harder than others, and that the classical 0-1 loss is ill-suited for general structured prediction.