Loss Functions for Classification using Structured Entropy
This work addresses a limitation of standard classification loss functions in domains where the target variable has inherent structure, although the contribution appears incremental because it builds on existing entropy concepts.
The authors address the fact that cross-entropy loss ignores similarities between target values by proposing structured entropy, a generalization of entropy that incorporates target structure while preserving many of its theoretical properties. They report that a structured cross-entropy loss improves results on several classification problems with a priori known target structure, though no concrete numbers are provided.
Cross-entropy loss is the standard metric used to train classification models in deep learning and gradient boosting. It is well known that this loss function fails to account for similarities between the different values of the target. We propose a generalization of entropy called {\em structured entropy} that uses a random partition to incorporate the structure of the target variable in a manner that retains many theoretical properties of standard entropy. We show that a structured cross-entropy loss yields better results on several classification problems where the target variable has an a priori known structure. The approach is simple, flexible, easily computable, and does not rely on a hierarchically defined notion of structure.
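The abstract does not give the formula, so the following is only a minimal sketch of one plausible reading: it assumes that the "random partition" is a finite set of partitions of the class labels, each with a probability weight, and that the structured cross-entropy is the weighted average of the ordinary cross-entropy computed after merging classes within each block. The function name `structured_cross_entropy`, the four-class example, and the weights are all hypothetical and are not taken from the paper.

```python
import numpy as np

def structured_cross_entropy(true_class, pred_probs, partitions, weights):
    """Assumed form: sum over partitions of
    weight * -log(predicted mass of the block containing the true class)."""
    pred_probs = np.asarray(pred_probs, dtype=float)
    loss = 0.0
    for partition, weight in zip(partitions, weights):
        # Find the block of this partition that contains the true class.
        block = next(b for b in partition if true_class in b)
        # Coarsened prediction: total probability assigned to that block.
        block_mass = pred_probs[sorted(block)].sum()
        loss += -weight * np.log(block_mass)
    return loss

# Hypothetical structure: classes 0 and 1 are similar, as are classes 2 and 3.
partitions = [
    [{0}, {1}, {2}, {3}],   # finest partition: recovers ordinary cross-entropy
    [{0, 1}, {2, 3}],       # coarse partition: groups similar classes
]
weights = [0.5, 0.5]        # assumed partition probabilities (sum to 1)

true_class = 0
near_miss = [0.2, 0.6, 0.1, 0.1]  # most mass on the similar class 1
far_miss = [0.2, 0.1, 0.6, 0.1]   # most mass on the dissimilar class 2

print(structured_cross_entropy(true_class, near_miss, partitions, weights))
print(structured_cross_entropy(true_class, far_miss, partitions, weights))
```

Both predictions assign probability 0.2 to the true class, so ordinary cross-entropy cannot tell them apart. Under the assumed structured loss, the near miss is penalized less because the coarse partition credits the mass placed on the similar class. Putting all weight on the finest partition reduces the sketch to standard cross-entropy.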