ML AI LG ST

Feb 1, 2021

Fast rates in structured prediction

Vivien Cabannes, Alessandro Rudi, Francis Bach

arXiv:2102.00760v3

11.019 citations

Originality: Highly original

AI Analysis

This work addresses slow convergence in discrete supervised learning and is aimed at machine learning researchers and practitioners. It offers a new way to accelerate rates in structured prediction, although it builds incrementally on existing surrogate methods.

The paper tackles slow convergence rates in structured prediction by exploiting the discrete nature of the outputs. It obtains "super fast" rates, that is, rates faster than $n^{-1}$, and even exponential rates under the strongest assumptions. The approach is illustrated with nearest neighbors and with kernel ridge regression, where known rates in $n^{-1/4}$ improve to arbitrarily fast rates depending on a parameter characterizing the hardness of the problem.
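
For intuition, the gap between the two regimes can be worked out in the binary case. The notation below ($\eta$, $g_n$, $f_n$, $\delta$) is assumed for illustration and does not come from this page; it sketches the standard binary classification argument with the 0-1 risk $\mathcal{R}$, not the paper's general statement.

$$
\begin{aligned}
&Y \in \{-1, 1\}, \quad \eta(x) = \mathbb{E}[Y \mid X = x], \quad f^*(x) = \operatorname{sign} \eta(x), \quad f_n(x) = \operatorname{sign} g_n(x), \\
&\mathcal{R}(f_n) - \mathcal{R}(f^*) = \mathbb{E}\big[\,|\eta(X)| \, \mathbf{1}\{f_n(X) \neq f^*(X)\}\big] \le \mathbb{E}\,|g_n(X) - \eta(X)| \le \|g_n - \eta\|_{L^2}, \\
&\text{and if } |\eta(X)| \ge \delta > 0 \text{ almost surely:} \quad \mathcal{R}(f_n) - \mathcal{R}(f^*) \le \mathbb{P}\big(|g_n(X) - \eta(X)| \ge \delta\big).
\end{aligned}
$$

The first bound only transfers the regression rate: a squared $L^2$ error of order $n^{-1/2}$, for example, yields an excess risk of order $n^{-1/4}$. The second bound uses the fact that a wrong prediction requires $g_n$ to miss $\eta$ by at least $\delta$, an event whose probability can decay much faster in $n$ when $g_n$ concentrates around $\eta$.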

Discrete supervised learning problems such as classification are often tackled by introducing a continuous surrogate problem akin to regression. Bounding the original error, between estimate and solution, by the surrogate error endows discrete problems with convergence rates already shown for continuous instances. Yet, current approaches do not leverage the fact that discrete problems are essentially predicting a discrete output when continuous problems are predicting a continuous value. In this paper, we tackle this issue for general structured prediction problems, opening the way to "super fast" rates, that is, convergence rates for the excess risk faster than $n^{-1}$, where $n$ is the number of observations, with even exponential rates with the strongest assumptions. We first illustrate it for predictors based on nearest neighbors, generalizing rates known for binary classification to any discrete problem within the framework of structured prediction. We then consider kernel ridge regression where we improve known rates in $n^{-1/4}$ to arbitrarily fast rates, depending on a parameter characterizing the hardness of the problem, thus allowing, under smoothness assumptions, to bypass the curse of dimensionality.
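
The surrogate approach described in the abstract can be sketched on a toy binary problem. The snippet below is a minimal illustration, not the authors' code: the data model (uniform inputs on $[0, 1]$, a conditional mean equal to $\pm\delta$), the helper names `knn_surrogate` and `errors`, and the choice $k \approx \sqrt{n}$ are all assumptions made for this example. It estimates the conditional mean with nearest neighbors, decodes with the sign, and reports both the surrogate (squared) error and the excess 0-1 risk on a grid.

```python
import numpy as np


def knn_surrogate(x_train, y_train, x_query, k):
    """Estimate E[Y | X = x] by averaging the labels of the k nearest training points."""
    # Pairwise distances between query points (rows) and training points (columns).
    dist = np.abs(x_query[:, None] - x_train[None, :])
    # Indices of the k smallest distances in each row (order within the k is irrelevant).
    nearest = np.argpartition(dist, k - 1, axis=1)[:, :k]
    return y_train[nearest].mean(axis=1)


def errors(n, k, delta=0.6, seed=0):
    """Return (surrogate squared error, excess 0-1 risk) on a grid for a toy problem."""
    rng = np.random.default_rng(seed)

    def eta(x):
        # Assumed conditional mean: |eta| = delta everywhere (a hard margin).
        return np.where(x >= 0.5, delta, -delta)

    x_train = rng.uniform(0.0, 1.0, size=n)
    # Labels in {-1, +1} with P(Y = 1 | x) = (1 + eta(x)) / 2.
    y_train = np.where(rng.uniform(size=n) < (1 + eta(x_train)) / 2, 1.0, -1.0)

    x_test = np.linspace(0.0, 1.0, 1001)
    g = knn_surrogate(x_train, y_train, x_test, k)

    # Continuous surrogate error: squared distance between estimate and eta.
    surrogate = np.mean((g - eta(x_test)) ** 2)
    # Discrete error: excess 0-1 risk of the decoded prediction sign(g).
    wrong = np.sign(g) != np.sign(eta(x_test))
    excess = np.mean(np.abs(eta(x_test)) * wrong)
    return surrogate, excess


if __name__ == "__main__":
    for n in (100, 400, 1600):
        k = int(np.sqrt(n)) | 1  # odd k, so the average of +/-1 labels is never zero
        surrogate, excess = errors(n, k)
        print(f"n={n:5d}  k={k:3d}  surrogate={surrogate:.4f}  excess={excess:.4f}")
```

On the grid, the excess risk is always at most the square root of the surrogate error, which is the generic comparison bound. Under the margin assumed here it can be far smaller, since a prediction is wrong only where the estimate lands on the wrong side of zero.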
