LGMLJun 20, 2019

ID3 Learns Juntas for Smoothed Product Distributions

arXiv:1906.08654v122 citations
Originality Incremental advance
AI Analysis

This provides theoretical justification for a widely used heuristic in machine learning, addressing a gap in understanding for practitioners and researchers.

The paper tackles the problem of theoretically analyzing the ID3 algorithm for learning decision trees, specifically when the target function is a k-junta, and proves that ID3 learns such functions in polynomial time under smoothed analysis with k = log n.

In recent years, there are many attempts to understand popular heuristics. An example of such a heuristic algorithm is the ID3 algorithm for learning decision trees. This algorithm is commonly used in practice, but there are very few theoretical works studying its behavior. In this paper, we analyze the ID3 algorithm, when the target function is a $k$-Junta, a function that depends on $k$ out of $n$ variables of the input. We prove that when $k = \log n$, the ID3 algorithm learns in polynomial time $k$-Juntas, in the smoothed analysis model of Kalai & Teng. That is, we show a learnability result when the observed distribution is a "noisy" variant of the original distribution.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes