ML · Jan 25, 2018

Information gain ratio correction: Improving prediction with more balanced decision tree splits

arXiv:1801.08310v1 · 8 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses a specific issue in decision tree algorithms for machine learning practitioners, but it is incremental as it builds on existing methods.

The paper tackles the bias in decision tree split selection by proposing an updated gain ratio correction to improve predictive accuracy, showing better performance than the original gain ratio.

Decision tree algorithms use a gain function to select the best split during tree induction. This function is crucial for obtaining trees with high predictive accuracy. Some gain functions suffer from a bias when they compare splits of different arities. Quinlan proposed a gain ratio in C4.5's information gain function to fix this bias. In this paper, we present an updated version of the gain ratio that performs better, as it tries to fix the gain ratio's bias toward unbalanced trees and toward some splits with low predictive interest.
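To make the arity bias concrete, here is a minimal Python sketch of information gain and Quinlan's original gain ratio (information gain divided by the split information) as commonly defined for C4.5. It does not implement the corrected gain ratio proposed in the paper, whose formula is not given in this summary. The function names and the toy data are illustrative assumptions.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(labels, branches):
    """Parent entropy minus the size-weighted entropy of the branches."""
    n = len(labels)
    remainder = sum(len(b) / n * entropy(b) for b in branches)
    return entropy(labels) - remainder


def split_information(branches):
    """Entropy of the branch sizes; it grows with the arity of the split."""
    n = sum(len(b) for b in branches)
    return -sum(len(b) / n * math.log2(len(b) / n) for b in branches if b)


def gain_ratio(labels, branches):
    """Quinlan's gain ratio: information gain normalised by split information."""
    si = split_information(branches)
    return information_gain(labels, branches) / si if si > 0 else 0.0


# Toy data (hypothetical): 4 positive and 4 negative examples.
labels = ["yes"] * 4 + ["no"] * 4

# A binary split that separates the classes perfectly.
binary_split = [["yes"] * 4, ["no"] * 4]

# An 8-way split on an ID-like attribute: one example per branch.
id_split = [[label] for label in labels]

ig_binary = information_gain(labels, binary_split)
ig_id = information_gain(labels, id_split)
gr_binary = gain_ratio(labels, binary_split)
gr_id = gain_ratio(labels, id_split)

print(f"information gain: binary={ig_binary:.3f}, id={ig_id:.3f}")
print(f"gain ratio:       binary={gr_binary:.3f}, id={gr_id:.3f}")
```

On this toy data both splits have the same information gain, so information gain alone cannot prefer the binary split over the ID-like one. The gain ratio divides by the split information, which is larger for the 8-way split, and so ranks the binary split higher.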

