LG DS MLMay 8, 2021

Learning stochastic decision trees

arXiv:2105.03594v14.43 citations

Originality Highly original

AI Analysis

This solves a fundamental problem in machine learning for robust decision tree learning, providing the first non-trivial algorithm with optimal noise resilience, which is incremental in improving over prior methods.

The paper tackles the problem of learning stochastic decision trees from noisy data, presenting a quasipolynomial-time algorithm that achieves error within an additive 2η + ε of the Bayes optimal, where 2η is the information-theoretic minimum, and it runs in time n^{O(log(s/ε)/ε^2)}.

We give a quasipolynomial-time algorithm for learning stochastic decision trees that is optimally resilient to adversarial noise. Given an $η$-corrupted set of uniform random samples labeled by a size-$s$ stochastic decision tree, our algorithm runs in time $n^{O(\log(s/\varepsilon)/\varepsilon^2)}$ and returns a hypothesis with error within an additive $2η+ \varepsilon$ of the Bayes optimal. An additive $2η$ is the information-theoretic minimum. Previously no non-trivial algorithm with a guarantee of $O(η) + \varepsilon$ was known, even for weaker noise models. Our algorithm is furthermore proper, returning a hypothesis that is itself a decision tree; previously no such algorithm was known even in the noiseless setting.

View on arXiv PDF

Similar