Ranked Set Sampling-Based Multilayer Perceptron: Improving Generalization via Variance-Based Bounds
This work addresses generalization in neural networks and is relevant to both researchers and practitioners, but it is incremental: it builds on existing variance reduction techniques such as bagging.
The paper aims to improve generalization in multilayer perceptrons. It establishes a new variance-based generalization error bound and proposes a Ranked Set Sampling (RSS) method to reduce the variance of the empirical loss, and it reports that RSS-MLP outperforms bagging with Simple Random Sampling on twelve benchmark datasets.
The multilayer perceptron (MLP), one of the most fundamental neural networks, is extensively used for classification and regression tasks. In this paper, we establish a new generalization error bound, which reveals how the variance of the empirical loss influences the generalization ability of the learning model. Inspired by this learning bound, we advocate reducing the variance of the empirical loss to enhance the generalization ability of MLP. Bagging is a popular ensemble method for variance reduction. However, bagging produces the base training data sets by Simple Random Sampling (SRS), which exhibits a high degree of randomness. To handle this issue, we introduce an ordered structure into the training data set by Ranked Set Sampling (RSS) to further reduce the variance of the loss, and develop an RSS-MLP method. Theoretical results show that the variances of the empirical exponential loss and the empirical logistic loss estimated by RSS are smaller than those estimated by SRS, respectively. To validate the performance of RSS-MLP, we conduct comparison experiments on twelve benchmark data sets in terms of the two convex loss functions under two fusion methods. Extensive experimental results and analysis illustrate the effectiveness and rationality of the proposed method.
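To make the contrast between SRS and RSS concrete, the following is a minimal sketch of balanced ranked set sampling, not the paper's RSS-MLP procedure. It rests on several assumptions that the abstract does not state: the ranking is done on the sampled value itself (perfect ranking), the population is a synthetic exponential variable standing in for per-example losses, and the names `srs_sample`, `rss_sample`, and `mean_estimator_variance` are hypothetical helpers introduced only for illustration. The sketch compares the variance of the sample mean under the two schemes at equal sample size.

```python
import numpy as np

def srs_sample(population, n, rng):
    """Simple random sampling with replacement, as in a bootstrap draw."""
    idx = rng.integers(0, len(population), size=n)
    return population[idx]

def rss_sample(population, set_size, cycles, rng):
    """Balanced ranked set sampling; returns set_size * cycles values.

    Assumption: units are ranked by their own value (perfect ranking).
    """
    chosen = []
    for _ in range(cycles):
        for rank in range(set_size):
            # Draw a fresh set of `set_size` units and sort it.
            idx = rng.integers(0, len(population), size=set_size)
            ranked = np.sort(population[idx])
            # Keep only the unit holding position `rank` in this set.
            chosen.append(ranked[rank])
    return np.array(chosen)

def mean_estimator_variance(sampler, trials):
    """Empirical variance of the sample mean over repeated draws."""
    return np.var([sampler().mean() for _ in range(trials)])

rng = np.random.default_rng(0)
# Synthetic stand-in for per-example losses (an assumption, not real data).
population = rng.exponential(scale=1.0, size=10_000)

set_size, cycles = 5, 4
n = set_size * cycles  # both schemes return the same number of values

var_srs = mean_estimator_variance(
    lambda: srs_sample(population, n, rng), trials=2000)
var_rss = mean_estimator_variance(
    lambda: rss_sample(population, set_size, cycles, rng), trials=2000)

print(f"SRS variance of the sample mean: {var_srs:.5f}")
print(f"RSS variance of the sample mean: {var_rss:.5f}")
```

In this toy setting RSS examines `set_size` times as many units as it keeps, so the lower variance comes at the cost of extra ranking work. How ranking is performed on actual training examples is a design choice the abstract leaves unspecified.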