Learning Optimal Linear Regularizers
This work addresses the challenge of tuning regularization hyperparameters, a practical concern for machine learning practitioners, though it appears incremental because it builds on existing regularization frameworks.
The paper tackles the problem of learning optimal linear regularizers to improve generalization. It views regularizers as upper bounds on the generalization gap and seeks to reduce the slack in that bound. It presents an algorithm based on linear programming that efficiently computes the hyperparameters giving the tightest bound, and shows it to be effective on both real and synthetic data.
We present algorithms for efficiently learning regularizers that improve generalization. Our approach is based on the insight that regularizers can be viewed as upper bounds on the generalization gap, and that reducing the slack in the bound can improve performance on test data. For a broad class of regularizers, the hyperparameters that give the best upper bound can be computed using linear programming. Under certain Bayesian assumptions, solving the LP lets us "jump" to the optimal hyperparameters given very limited data. This suggests a natural algorithm for tuning regularization hyperparameters, which we show to be effective on both real and synthetic data.
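To make the "regularizer as upper bound" idea concrete, here is a minimal sketch of one plausible reading of it, not the paper's actual algorithm. It assumes each trained model i is summarized by two nonnegative features q_i (hypothetically, an L1 norm and a squared L2 norm of its weights) and by an estimated generalization gap g_i (validation loss minus training loss). It then looks for nonnegative hyperparameters lam such that lam . q_i >= g_i for every model, with the smallest total slack. The function name `tightest_linear_bound`, the feature choice, and the synthetic data are all illustrative assumptions. To stay dependency-free, the two-variable LP is solved by brute-force vertex enumeration; a real implementation would call an LP solver.

```python
from itertools import combinations


def tightest_linear_bound(features, gaps, tol=1e-9):
    """Return (lam, total_slack) for a two-hyperparameter linear regularizer.

    Solves the LP
        minimize   sum_i (lam . q_i - g_i)
        subject to lam . q_i >= g_i for all i,  lam >= 0
    by enumerating vertices of the feasible region. Assumes every feature
    is nonnegative, so the objective is bounded below on lam >= 0.
    """
    # Every constraint is written as a . lam >= b.
    constraints = list(zip(features, gaps))
    constraints += [((1.0, 0.0), 0.0), ((0.0, 1.0), 0.0)]  # lam >= 0

    def feasible(lam):
        return all(a[0] * lam[0] + a[1] * lam[1] >= b - tol
                   for a, b in constraints)

    def total_slack(lam):
        return sum(q[0] * lam[0] + q[1] * lam[1] - g
                   for q, g in zip(features, gaps))

    best = None
    # In two dimensions, each vertex is where two constraint boundaries meet.
    for (a1, b1), (a2, b2) in combinations(constraints, 2):
        det = a1[0] * a2[1] - a1[1] * a2[0]
        if abs(det) < tol:
            continue  # parallel boundaries never intersect in a single point
        # Cramer's rule for the 2x2 system a1 . lam = b1, a2 . lam = b2.
        lam = ((b1 * a2[1] - a1[1] * b2) / det,
               (a1[0] * b2 - b1 * a2[0]) / det)
        if feasible(lam) and (best is None
                              or total_slack(lam) < total_slack(best)):
            best = lam
    if best is None:
        raise ValueError("no nonnegative lam upper-bounds every gap")
    return best, total_slack(best)


# Synthetic data (an assumption for illustration): the gaps are generated
# to be exactly linear in the features, with weights (0.5, 0.1).
features = [(1.0, 2.0), (2.0, 1.0), (3.0, 3.0)]
gaps = [0.5 * q1 + 0.1 * q2 for q1, q2 in features]

lam, slack = tightest_linear_bound(features, gaps)
print("hyperparameters:", lam, "total slack:", slack)
```

Because the synthetic gaps here are exactly linear in the features, the tightest bound has essentially zero slack and the recovered hyperparameters match the generating weights up to floating-point error. With noisy gap estimates the LP would instead return the smallest linear function that still dominates every observed gap.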