Single-Solution Hypervolume Maximization and its use for Improving Generalization of Neural Networks
The paper addresses generalization in neural networks, a concern for machine learning practitioners, but the contribution is incremental because it builds on existing loss optimization methods.
The paper tackles the problem of improving neural network generalization by introducing single-solution hypervolume maximization as an alternative to mean loss minimization, showing a 20% reduction in classification error on the MNIST test set.
This paper introduces hypervolume maximization with a single solution as an alternative to mean loss minimization. The relationship between the two problems is proved through bounds on the cost function when an optimal solution to one problem is evaluated on the other, with a hyperparameter that controls how similar the two problems are. The same hyperparameter allows higher weight to be placed on samples with higher loss when computing the hypervolume's gradient, whose normalized version can range from the gradient of the mean loss to that of the max loss. An experiment on MNIST with a neural network is used to validate the theory, showing that hypervolume maximization can behave similarly to mean loss minimization and can also provide better performance, resulting in a 20% reduction of the classification error on the test set.
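The way the hyperparameter interpolates between the mean loss and the max loss can be illustrated with a minimal sketch. The abstract does not state the objective, so the form below is an assumption: the log-hypervolume is taken as the sum of log(mu - loss_i) over samples, where the reference point mu must exceed every loss. Under that assumption, the gradient weights each sample's loss gradient by 1 / (mu - loss_i). The names `mu`, `hypervolume_weights`, `log_hypervolume`, and `weighted_loss` are illustrative and not taken from the paper.

```python
import math


def hypervolume_weights(losses, mu):
    """Normalized per-sample weights of the log-hypervolume gradient.

    Assumes the objective sum_i log(mu - loss_i), so the weight of sample i
    is proportional to 1 / (mu - loss_i). This form is an assumption.
    """
    if mu <= max(losses):
        raise ValueError("mu must be strictly larger than every loss")
    inv_gaps = [1.0 / (mu - loss) for loss in losses]
    total = sum(inv_gaps)
    return [gap / total for gap in inv_gaps]


def log_hypervolume(losses, mu):
    """Log of the hypervolume dominated by a single solution (assumed form)."""
    if mu <= max(losses):
        raise ValueError("mu must be strictly larger than every loss")
    return sum(math.log(mu - loss) for loss in losses)


def weighted_loss(losses, mu):
    """Loss average under the normalized hypervolume weights."""
    weights = hypervolume_weights(losses, mu)
    return sum(w * loss for w, loss in zip(weights, losses))


# Hypothetical per-sample losses, chosen only for illustration.
losses = [0.1, 0.2, 0.3, 2.0]

# Large mu: weights are nearly uniform, so the weighted loss is close to the mean.
far = weighted_loss(losses, mu=1e6)

# mu just above the largest loss: almost all weight goes to the worst sample.
near = weighted_loss(losses, mu=max(losses) + 1e-6)
```

In this sketch, `far` stays close to the mean of `losses` and `near` stays close to their maximum, which matches the range described for the normalized gradient. Intermediate values of `mu` give weightings between the two extremes.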