Inducing a hierarchy for multi-class classification problems
This work addresses a common limitation of classification tasks, the absence of a hierarchical label structure, and offers a practical way to improve accuracy by organizing the labels.
The paper addresses multi-class classification on datasets that lack a predefined hierarchical label structure. It proposes a method to induce a hierarchy, which improves classification performance over flat classifiers in simulations and real data applications.
In applications where categorical labels follow a natural hierarchy, classification methods that exploit the label structure often outperform those that do not. Unfortunately, the majority of classification datasets do not come pre-equipped with a hierarchical structure, and classical flat classifiers must be employed. In this paper, we investigate a class of methods that induce a hierarchy that can similarly improve classification performance over flat classifiers. The class of methods follows the structure of first clustering the conditional distributions and subsequently using a hierarchical classifier with the induced hierarchy. We demonstrate the effectiveness of the class of methods both for discovering a latent hierarchy and for improving accuracy in principled simulation settings and three real data applications.
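The two-step structure described above (cluster the class-conditional distributions, then classify along the induced hierarchy) can be illustrated with a minimal sketch. It is not the paper's implementation. Several choices here are assumptions made purely for illustration: each conditional distribution is summarized by its class mean, classes are grouped by average-linkage agglomerative clustering on Euclidean distance, and both levels of the hierarchy use a nearest-centroid classifier. All function names and the toy data are hypothetical.

```python
import math
import random
from collections import defaultdict

def class_means(X, y):
    """Summarize each class-conditional distribution by its mean (an assumed stand-in)."""
    groups = defaultdict(list)
    for x, label in zip(X, y):
        groups[label].append(x)
    return {c: [sum(col) / len(rows) for col in zip(*rows)]
            for c, rows in groups.items()}

def induce_hierarchy(means, n_groups):
    """Step 1: merge classes by average linkage until n_groups clusters remain."""
    clusters = [[c] for c in sorted(means)]

    def linkage(a, b):
        total = sum(math.dist(means[i], means[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_groups:
        pairs = [(i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters))]
        i, j = min(pairs, key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

def nearest(x, centroids):
    """Return the key of the centroid closest to x."""
    return min(centroids, key=lambda k: math.dist(x, centroids[k]))

def fit_hierarchical(X, y, n_groups):
    """Step 2: fit a two-level nearest-centroid classifier on the induced hierarchy."""
    means = class_means(X, y)
    clusters = induce_hierarchy(means, n_groups)
    group_of = {c: g for g, members in enumerate(clusters) for c in members}
    group_means = class_means(X, [group_of[label] for label in y])
    return means, clusters, group_means

def predict(x, model):
    """Route x to a group of classes first, then to a class within that group."""
    means, clusters, group_means = model
    g = nearest(x, group_means)
    return nearest(x, {c: means[c] for c in clusters[g]})

# Toy data (hypothetical): four classes forming two well-separated pairs.
random.seed(0)
centers = {0: (0.0, 0.0), 1: (1.0, 0.0), 2: (10.0, 10.0), 3: (11.0, 10.0)}
X, y = [], []
for label, (cx, cy) in centers.items():
    for _ in range(50):
        X.append((random.gauss(cx, 0.2), random.gauss(cy, 0.2)))
        y.append(label)

model = fit_hierarchical(X, y, n_groups=2)
print(model[1])                    # induced grouping of the class labels
print(predict((0.1, 0.0), model))  # predicted class for a new point
```

In this toy setting the latent hierarchy is recoverable because the class pairs are far apart relative to the within-class noise. A richer summary of each conditional distribution, or a stronger classifier at each node, could be substituted without changing the overall two-step structure.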