Loss Minimization Yields Multicalibration for Large Neural Networks
The work addresses fairness in machine learning by ensuring calibrated predictions across diverse groups, though the contribution is incremental, as it builds on prior multicalibration work.
The paper shows that minimizing squared loss over large neural networks implies multicalibration with respect to protected groups represented by smaller networks, for all but a bounded number of network sizes. This overcomes the limitation of previous results, which required the predictor to be near the ground truth.
Multicalibration is a notion of fairness for predictors that requires them to provide calibrated predictions across a large set of protected groups. Multicalibration is known to be a distinct goal from loss minimization, even for simple predictors such as linear functions. In this work, we consider the setting where the protected groups can be represented by neural networks of size $k$, and the predictors are neural networks of size $n > k$. We show that minimizing the squared loss over all neural nets of size $n$ implies multicalibration for all but a bounded number of unlucky values of $n$. We also give evidence that our bound on the number of unlucky values is tight, given our proof technique. Previously, results of the flavor that loss minimization yields multicalibration were known only for predictors that were near the ground truth, and hence were rather limited in applicability. Unlike these, our results rely on the expressivity of neural nets and utilize the representation of the predictor.
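To make the notion of multicalibration concrete, the following is a minimal sketch of how a multicalibration violation might be estimated on a finite sample. It is an illustration under stated assumptions, not the paper's definition or method: the function name `multicalibration_error`, the equal-width binning of predictions, and the normalization by sample size are all hypothetical choices, and predictions are assumed to lie in $[0, 1]$. Each group is given as a vector of weights $c(x)$, which in the paper's setting would be the output of a size-$k$ network.

```python
import numpy as np

def multicalibration_error(preds, labels, group_weights, n_bins=10):
    """Worst-case empirical calibration violation over groups and prediction bins.

    Assumed (hypothetical) definition: for each group c and each bin b, estimate
    |E[c(x) * (y - f(x)) * 1[f(x) in bin b]]| and return the maximum.
    """
    preds = np.asarray(preds, dtype=float)                   # f(x), assumed in [0, 1]
    labels = np.asarray(labels, dtype=float)                 # y
    group_weights = np.asarray(group_weights, dtype=float)   # shape (n_groups, n_samples)

    # Equal-width bins over [0, 1]; a prediction of exactly 1.0 goes in the last bin.
    bins = np.minimum((preds * n_bins).astype(int), n_bins - 1)
    residual = labels - preds

    worst = 0.0
    for c in group_weights:
        for b in range(n_bins):
            mask = bins == b
            violation = abs(np.sum(c[mask] * residual[mask])) / len(preds)
            worst = max(worst, float(violation))
    return worst

# A constant predictor that is calibrated overall but not on a subgroup.
preds = [0.5, 0.5, 0.5, 0.5]
labels = [1, 0, 1, 0]
everyone = [1, 1, 1, 1]
subgroup = [1, 0, 1, 0]   # contains only the positively labeled points

print(multicalibration_error(preds, labels, [everyone]))            # 0.0
print(multicalibration_error(preds, labels, [everyone, subgroup]))  # 0.25
```

The toy example shows why multicalibration is stronger than ordinary calibration: the constant predictor has zero violation on the whole population, yet it underestimates the outcome on the subgroup, which the second call detects.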