ML · AI · LG · AP · Apr 15, 2020

Exploiting Categorical Structure Using Tree-Based Methods

arXiv:2004.07383v1 · 7 citations
AI Analysis

This work addresses a domain-specific issue in machine learning for handling categorical data, offering an incremental improvement over existing methods.

The paper tackles the problem of representing categorical variables whose structure goes beyond a linear ordering. It develops a mathematical framework for that structure, generalizes decision trees to exploit it, and reports improvements on weather data.

Standard methods of using categorical variables as predictors either endow them with an ordinal structure or assume they have no structure at all. However, categorical variables often possess structure that is more complicated than a linear ordering can capture. We develop a mathematical framework for representing the structure of categorical variables and show how to generalize decision trees to make use of this structure. This approach is applicable to methods such as Gradient Boosted Trees which use a decision tree as the underlying learner. We show results on weather data to demonstrate the improvement yielded by this approach.
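To make the idea of a structure-aware split concrete, here is a minimal sketch. It is not the paper's framework, which the abstract does not spell out. It assumes, purely for illustration, that the structure of a categorical variable can be described as a collection of admissible subsets of its levels, and that a tree node may only split on one of those subsets. The example variable (compass wind direction, treated as cyclic), the helper names `circular_arcs`, `sse`, and `best_structured_split`, and the squared-error criterion are all hypothetical choices made for this sketch.

```python
def circular_arcs(levels):
    """Yield every contiguous arc of a cyclic ordering as a frozenset.

    Assumption for this sketch: for a cyclic variable, the admissible
    splits are "arc versus the rest of the circle".
    """
    n = len(levels)
    for length in range(1, n):
        for start in range(n):
            yield frozenset(levels[(start + k) % n] for k in range(length))


def sse(ys):
    """Sum of squared errors around the mean (0.0 for an empty list)."""
    if not ys:
        return 0.0
    mean = sum(ys) / len(ys)
    return sum((y - mean) ** 2 for y in ys)


def best_structured_split(xs, ys, candidate_subsets):
    """Return (cost, subset) for the admissible subset with the lowest
    total squared error after splitting on membership in that subset."""
    best_cost, best_subset = float("inf"), None
    for subset in candidate_subsets:
        left = [y for x, y in zip(xs, ys) if x in subset]
        right = [y for x, y in zip(xs, ys) if x not in subset]
        if not left or not right:
            continue  # a split must send data to both children
        cost = sse(left) + sse(right)
        if cost < best_cost:
            best_cost, best_subset = cost, subset
    return best_cost, best_subset


# Toy data (invented for illustration): the response is 1 for northerly winds.
levels = ["N", "NE", "E", "SE", "S", "SW", "W", "NW"]
xs = levels * 2
ys = [1.0 if x in {"NW", "N", "NE"} else 0.0 for x in xs]

cost, subset = best_structured_split(xs, ys, circular_arcs(levels))
print(cost, sorted(subset))
```

On this toy data a single arc split isolates the northerly directions. A threshold split on the linear order N, NE, ..., NW cannot do this in one step, because NW and N sit at opposite ends of that order. The same split search could, in principle, serve as the node-splitting step inside a boosted-tree learner, with the candidate subsets swapped for whatever structure the variable actually has.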

