Splitting criteria for ordinal decision trees: an experimental study
This work addresses the suboptimal use of nominal methods in ordinal classification problems, which is critical for applications where error magnitude matters, though it is incremental as it compares existing criteria.
This paper tackles the problem of ordinal classification by comparing three ordinal splitting criteria (OGini, WIG, RI) against nominal ones in decision trees, finding that OGini reduces mean absolute error by over 3.02% compared to Gini across 45 datasets.
Ordinal Classification (OC) addresses those classification tasks where the labels exhibit a natural order. Unlike nominal classification, which treats all classes as mutually exclusive and unordered, OC takes the ordinal relationship into account, producing more accurate and relevant results. This is particularly critical in applications where the magnitude of classification errors has significant consequences. Despite this, OC problems are often tackled using nominal methods, leading to suboptimal solutions. Although decision trees are among the most popular classification approaches, ordinal tree-based approaches have received less attention when compared to other classifiers. This work provides a comprehensive survey of ordinal splitting criteria, standardising the notations used in the literature to enhance clarity and consistency. Three ordinal splitting criteria, Ordinal Gini (OGini), Weighted Information Gain (WIG), and Ranking Impurity (RI), are compared to the nominal counterparts of the first two (Gini and information gain), by incorporating them into a decision tree classifier. An extensive repository considering $45$ publicly available OC datasets is presented, supporting the first experimental comparison of ordinal and nominal splitting criteria using well-known OC evaluation metrics. The results have been statistically analysed, highlighting that OGini stands out as the best ordinal splitting criterion to date, reducing the mean absolute error achieved by Gini by more than 3.02%. To promote reproducibility, all source code developed, a detailed guide for reproducing the results, the 45 OC datasets, and the individual results for all the evaluated methodologies are provided.