A Concentration of Measure Framework to study convex problems and other implicit formulation problems in machine learning
The framework provides a theoretical foundation for understanding the behavior and generalization error of machine learning algorithms that are defined through implicit equations, such as logistic regression and the lasso.
The paper tackles the problem of analyzing solutions to convex optimization problems in machine learning by developing a concentration of measure framework, showing that solutions concentrate when the objective depends on random data satisfying concentration hypotheses, and providing precise moment estimates for solutions to characterize algorithm performance.
This paper provides a framework to show the concentration of the solution $Y^*$ to a convex minimization problem whose objective function $φ(X)(Y)$ depends on a random vector $X$ satisfying concentration of measure hypotheses. More precisely, the convex problem translates into a contractive fixed point equation that ensures the transmission of concentration from $X$ to $Y^*$. This result is of central interest for characterizing many machine learning algorithms that are defined through implicit equations (e.g., logistic regression, lasso, boosting). Based on our framework, we provide precise estimates of the first moments of the solution $Y^*$ when $X = (x_1,\ldots, x_n)$ is a data matrix with independent columns and $φ(X)(Y)$ can be written as a sum $\frac{1}{n}\sum_{i=1}^n h_i(x_i^TY)$. This allows us to describe the behavior and performance (e.g., generalization error) of a wide variety of machine learning classifiers.
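As an informal numerical illustration of the fixed point viewpoint, the sketch below solves a logistic-type problem by iterating a gradient-step map and then looks at how a linear observation $u^TY^*$ fluctuates over independent draws of $X$. Everything specific here is an assumption made for the example and is not taken from the paper: the ridge term `lam` (added so that the gradient-step map is a contraction), the step size, the two-class Gaussian data model with mean `mu`, and the function names `solve_fixed_point` and `observation_samples`.

```python
import numpy as np


def solve_fixed_point(X, labels, lam=1.0, step=0.5, iters=100):
    """Iterate Y <- Y - step * grad phi(X)(Y) starting from Y = 0.

    Assumed objective (ridge term is an illustrative addition):
        phi(X)(Y) = (1/n) * sum_i log(1 + exp(-labels_i * x_i^T Y))
                    + (lam / 2) * ||Y||^2
    X has shape (p, n); its columns x_i are the data points.
    """
    p, n = X.shape
    Y = np.zeros(p)
    for _ in range(iters):
        margins = labels * (X.T @ Y)
        # d/dm log(1 + exp(-m)) = -1 / (1 + exp(m)), chain rule adds labels_i * x_i
        weights = -labels / (1.0 + np.exp(margins))
        grad = X @ weights / n + lam * Y
        Y = Y - step * grad
    return Y


def observation_samples(p, n, trials=50, seed=0):
    """Return u^T Y* for `trials` independent draws of the data matrix X."""
    rng = np.random.default_rng(seed)
    mu = np.ones(p) / np.sqrt(p)  # hypothetical class mean with unit norm
    u = mu                        # fixed deterministic direction to observe
    out = np.empty(trials)
    for t in range(trials):
        labels = rng.choice([-1.0, 1.0], size=n)
        # column i is labels_i * mu plus standard Gaussian noise
        X = np.outer(mu, labels) + rng.standard_normal((p, n))
        out[t] = u @ solve_fixed_point(X, labels)
    return out


if __name__ == "__main__":
    for p, n in [(20, 200), (80, 800)]:
        obs = observation_samples(p, n)
        print(f"p={p:3d} n={n:4d} mean={obs.mean():.4f} std={obs.std():.4f}")
```

With the ratio $p/n$ held fixed, the empirical standard deviation of $u^TY^*$ should shrink as $n$ grows, which is the kind of behavior the concentration statement describes. The simulation is only a sanity check under the stated assumptions, not a substitute for the moment estimates derived in the paper.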