Learning Classifiers with Fenchel-Young Losses: Generalized Entropies, Margins, and Algorithms
This work gives machine learning practitioners a theoretical framework for designing and analyzing loss functions, though its contribution is incremental in that it builds on existing concepts of regularization and entropy.
The paper addresses the construction of convex loss functions for classifiers. It introduces Fenchel-Young losses, which unify existing losses and make it possible to create new ones from generalized entropies, yielding efficient algorithms together with properties such as separation margins and sparse probability distributions.
This paper studies Fenchel-Young losses, a generic way to construct convex loss functions from a regularization function. We analyze their properties in depth, showing that they unify many well-known loss functions and make it easy to create useful new ones. Fenchel-Young losses constructed from a generalized entropy, including the Shannon and Tsallis entropies, induce predictive probability distributions. We formulate conditions for a generalized entropy to yield losses with a separation margin, and probability distributions with sparse support. Finally, we derive efficient algorithms, making Fenchel-Young losses appealing both in theory and practice.
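To make the construction concrete, here is a minimal sketch in plain Python; it is an illustration, not the authors' implementation. It assumes the usual definition of the Fenchel-Young loss generated by a regularizer Ω, namely L(θ; y) = Ω*(θ) + Ω(y) − ⟨θ, y⟩, where Ω* is the convex conjugate of Ω restricted to the probability simplex. The helper names (`fy_loss`, `neg_shannon`, `neg_tsallis2`, `softmax`, `sparsemax`) are hypothetical and chosen only for this example. The negative Shannon entropy should recover the logistic loss with softmax predictions, while the negative Tsallis entropy with α = 2 should give a sparsemax-style loss with sparse predictions and a margin.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def neg_shannon(p):
    """Omega(p) = sum_j p_j log p_j, with the convention 0 log 0 = 0."""
    return sum(x * math.log(x) for x in p if x > 0.0)

def neg_tsallis2(p):
    """Omega(p) = 0.5 * ||p||^2 - 0.5 (negative Tsallis entropy, alpha = 2)."""
    return 0.5 * dot(p, p) - 0.5

def softmax(theta):
    """Maximizer of <theta, p> - neg_shannon(p) over the simplex."""
    m = max(theta)  # shift for numerical stability
    exps = [math.exp(t - m) for t in theta]
    z = sum(exps)
    return [e / z for e in exps]

def sparsemax(theta):
    """Maximizer of <theta, p> - neg_tsallis2(p): Euclidean projection
    of theta onto the simplex, computed by sorting."""
    cumsum, tau = 0.0, 0.0
    for k, zk in enumerate(sorted(theta, reverse=True), start=1):
        cumsum += zk
        if 1.0 + k * zk > cumsum:
            tau = (cumsum - 1.0) / k  # threshold for the current support size
        else:
            break
    return [max(t - tau, 0.0) for t in theta]

def fy_loss(theta, y, omega, predict):
    """Fenchel-Young loss Omega*(theta) + Omega(y) - <theta, y>.
    The conjugate is evaluated at its maximizer p = predict(theta)."""
    p = predict(theta)
    conjugate = dot(theta, p) - omega(p)
    return conjugate + omega(y) - dot(theta, y)

theta = [2.0, 1.0, -1.0]  # scores for three classes
y = [1.0, 0.0, 0.0]       # one-hot ground truth (class 0)

logistic = fy_loss(theta, y, neg_shannon, softmax)
sparse = fy_loss(theta, y, neg_tsallis2, sparsemax)
print("softmax:", softmax(theta), "loss:", logistic)
print("sparsemax:", sparsemax(theta), "loss:", sparse)
```

In this example the correct class leads the runner-up by a score gap of 1, so the sparsemax prediction puts all of its mass on class 0 and the α = 2 loss is exactly zero, while the logistic loss stays strictly positive. This illustrates the separation margin and sparse support that the abstract refers to.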