LG | AI | Nov 11, 2025

Binary Split Categorical feature with Mean Absolute Error Criteria in CART

arXiv:2511.08470v1 | h-index: 52
Originality: Incremental advance
AI Analysis

This work addresses a specific technical bottleneck in CART algorithms for handling categorical data, making it incremental rather than broadly impactful.

The paper tackles the problem of efficiently splitting categorical features in CART under the Mean Absolute Error (MAE) criterion. It argues that unsupervised numerical encoding methods are not viable for this criterion and presents a new splitting algorithm, though no concrete numbers are provided.

In the context of the Classification and Regression Trees (CART) algorithm, the efficient splitting of categorical features using standard criteria like Gini and Entropy is well-established. However, using the Mean Absolute Error (MAE) criterion for categorical features has traditionally relied on various numerical encoding methods. This paper demonstrates that unsupervised numerical encoding methods are not viable for the MAE criterion. Furthermore, we present a novel and efficient splitting algorithm that addresses the challenges of handling categorical features with the MAE criterion. Our findings underscore the limitations of existing approaches and offer a promising solution to enhance the handling of categorical data in CART algorithms.
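
To make the abstract's claim about encodings concrete, here is a minimal sketch of a binary categorical split scored by MAE. It is not the paper's algorithm, which the abstract does not describe. It is an exhaustive baseline compared against threshold splits on an arbitrary ordinal encoding. All function names (`sad`, `split_cost`, `best_split_exhaustive`, `best_split_ordinal`) and the toy data are assumptions made for illustration. The cost of a child node is taken to be the sum of absolute deviations from its median, since the median is the constant that minimises absolute error.

```python
from itertools import combinations
from statistics import median


def sad(values):
    """Sum of absolute deviations from the median of `values`."""
    if not values:
        return 0.0
    m = median(values)
    return float(sum(abs(v - m) for v in values))


def split_cost(x, y, left_categories):
    """Total absolute error when `left_categories` go left and the rest go right."""
    left = [yi for xi, yi in zip(x, y) if xi in left_categories]
    right = [yi for xi, yi in zip(x, y) if xi not in left_categories]
    return sad(left) + sad(right)


def best_split_exhaustive(x, y):
    """Try every two-way partition of the categories (exponential in their count)."""
    cats = sorted(set(x))
    first, rest = cats[0], cats[1:]
    best_cost, best_left = float("inf"), None
    # Pin the first category to the left child so mirror-image partitions
    # are not evaluated twice; r < len(rest) keeps the right child non-empty.
    for r in range(len(rest)):
        for extra in combinations(rest, r):
            left = {first, *extra}
            cost = split_cost(x, y, left)
            if cost < best_cost:
                best_cost, best_left = cost, left
    return best_cost, best_left


def best_split_ordinal(x, y, order):
    """Only consider threshold splits along a fixed category order (an encoding)."""
    best_cost, best_left = float("inf"), None
    for i in range(1, len(order)):
        left = set(order[:i])
        cost = split_cost(x, y, left)
        if cost < best_cost:
            best_cost, best_left = cost, left
    return best_cost, best_left


# Toy data (hypothetical): category "b" has a different target level than "a" and "c".
x = ["a", "a", "b", "b", "c", "c"]
y = [0.0, 0.0, 10.0, 10.0, 0.0, 0.0]

exhaustive_cost, exhaustive_left = best_split_exhaustive(x, y)
ordinal_cost, ordinal_left = best_split_ordinal(x, y, ["a", "b", "c"])
print(exhaustive_cost, exhaustive_left)
print(ordinal_cost, ordinal_left)
```

On this toy data the exhaustive search isolates "b" and reaches zero error. The alphabetical encoding cannot express that partition as a single threshold, so it is left with a worse split. The exhaustive search evaluates 2^(k-1) - 1 partitions for k categories, which is why an efficient alternative is of interest.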

