LG · AI · Aug 31, 2021

Chi-square Loss for Softmax: an Echo of Neural Network Structure

arXiv:2108.13822v11.6

Originality: Incremental advance

AI Analysis

This is an incremental improvement for neural network classification tasks, offering a new loss function with statistical grounding and structural insights.

The authors propose a chi-square loss function as an alternative to cross-entropy for softmax classification. They prove that it is unbiased in optimization and show that it must be used with label smoothing. Visualizing the loss reveals the influence of neural network structure, though performance degrades when there are many classes.

Softmax combined with cross-entropy is widely used in classification; cross-entropy evaluates the similarity between two discrete distributions (the predictions and the true labels). Inspired by the chi-square test, we designed a new loss function, called chi-square loss, which also works with softmax. Chi-square loss has a statistical background. We proved that it is unbiased in optimization and clarified the conditions for its use (its formula dictates that it must be used with label smoothing). In addition, we studied the sample distribution of this loss function through visualization and found that the distribution is related to the neural network structure, which distinguishes it from cross-entropy. In the past, the influence of structure was often ignored in visualization. Chi-square loss can detect changes in neural network structure because it is very strict, and we explain the reason for this strictness. We also studied the influence of label smoothing and discussed the relationship between label smoothing and training accuracy and stability. Because the chi-square loss is very strict, performance degrades when the samples come from a very large number of classes.
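
The abstract does not give the loss formula, so the following is only a minimal sketch under stated assumptions: that the loss takes the Pearson chi-square form, sum over classes of (p_k - q_k)^2 / q_k, with the softmax output p as the observed distribution and the smoothed label q as the expected one, and that label smoothing uses the common uniform scheme. The function names and the default eps = 0.1 are hypothetical and chosen for illustration. Under this assumed form, a one-hot label would put zeros in the denominator, which is consistent with the stated requirement that the loss be used with label smoothing.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def smooth_labels(target, num_classes, eps=0.1):
    """Uniform label smoothing (assumed scheme): (1 - eps) * one_hot + eps / K."""
    base = eps / num_classes
    return [base + (1.0 - eps if k == target else 0.0) for k in range(num_classes)]


def chi_square_loss(logits, target, eps=0.1):
    """Assumed Pearson-style chi-square loss: sum_k (p_k - q_k)^2 / q_k.

    p is the softmax output and q the smoothed label. This form is an
    assumption for illustration, not a formula quoted from the paper.
    """
    if not 0.0 < eps < 1.0:
        # With eps = 0 the label is one-hot and q_k = 0 for every wrong class,
        # so the denominator would be zero.
        raise ValueError("eps must lie strictly between 0 and 1")
    p = softmax(logits)
    q = smooth_labels(target, len(logits), eps)
    return sum((pk - qk) ** 2 / qk for pk, qk in zip(p, q))


def cross_entropy_loss(logits, target, eps=0.1):
    """Cross-entropy against the same smoothed label, for comparison."""
    p = softmax(logits)
    q = smooth_labels(target, len(logits), eps)
    return -sum(qk * math.log(pk) for qk, pk in zip(q, p))


if __name__ == "__main__":
    logits = [2.0, 0.5, -1.0]
    print("chi-square:   ", chi_square_loss(logits, target=0))
    print("cross-entropy:", cross_entropy_loss(logits, target=0))
```

In this sketch the chi-square loss is zero exactly when the softmax output equals the smoothed label, and the small values of q on the non-target classes enlarge any error there. That may be one reading of the strictness the abstract describes, though the abstract itself does not spell out the mechanism.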
