Neural Network Classifier as Mutual Information Evaluator
This work addresses classification accuracy on imbalanced datasets, pairing a new theoretical view of softmax classifiers with a practical modification to the softmax function.
The paper reinterprets neural network classifiers trained with softmax and cross-entropy as mutual information evaluators, showing that training on a balanced dataset maximizes the mutual information between inputs and labels. It also introduces a new form of softmax that extends this interpretation to imbalanced datasets and, in the reported experiments, improves classification accuracy, particularly on imbalanced data.
Cross-entropy loss with softmax output is a standard choice for training neural network classifiers. We give a new view of neural network classifiers with softmax and cross-entropy as mutual information evaluators. We show that when the dataset is balanced, training a neural network with cross-entropy maximises the mutual information between inputs and labels through a variational form of mutual information. Building on this view, we develop a new form of softmax that also converts a classifier to a mutual information evaluator when the dataset is imbalanced. Experimental results show that the new form leads to better classification accuracy, in particular for imbalanced datasets.
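The link between cross-entropy and mutual information can be illustrated with a short NumPy sketch. It relies on the standard variational bound I(X;Y) >= H(Y) + E[log q(y|x)], which holds for any classifier q and reduces to log M minus the cross-entropy when the M classes are balanced. The function names below are hypothetical, and the prior-weighted variant is only an assumed illustration of how class priors can enter the normalizer; it is not necessarily the exact softmax form proposed in the paper.

```python
import numpy as np

def log_softmax(logits):
    """Numerically stable log-softmax over the class axis."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))

def mi_lower_bound(logits, labels, priors):
    """Estimate H(Y) + E[log q(y|x)] in nats, a lower bound on I(X;Y).

    logits: (n, M) array of classifier outputs
    labels: (n,) array of integer class indices
    priors: (M,) array of class probabilities, assumed strictly positive
    """
    n = len(labels)
    log_q = log_softmax(logits)[np.arange(n), labels]
    entropy_y = -np.sum(priors * np.log(priors))
    # Negative cross-entropy plus label entropy; equals log M - CE when balanced.
    return log_q.mean() + entropy_y

def prior_weighted_log_ratio(logits, labels, priors):
    """Assumed illustration, not the paper's confirmed formula.

    Uses q(y|x) proportional to P(y) * exp(logit_y) and returns the mean of
    log(q(y|x) / P(y)) = logit_y - log(sum_y' P(y') * exp(logit_y')).
    """
    n = len(labels)
    weighted = logits + np.log(priors)
    top = weighted.max(axis=1, keepdims=True)
    log_norm = top[:, 0] + np.log(np.exp(weighted - top).sum(axis=1))
    return (logits[np.arange(n), labels] - log_norm).mean()
```

With uniform priors the two estimates coincide, which mirrors the balanced-dataset case described above. With non-uniform priors the second function still averages log(q(y|x) / P(y)), so it remains a valid lower bound on the mutual information under the stated assumption about q.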