CC DS LG | Sep 19, 2024

Fast decision tree learning solves hard coding-theoretic problems

Caleb Koch, Carmen Strassle, Li-Yang Tan

arXiv:2409.13096

AI Analysis

This work establishes a connection between two long-standing open problems, one in computational learning theory and one in coding theory. The connection can guide the design of new algorithms for the coding problem, or serve as evidence that the existing decision tree learning algorithm is optimal.

The paper connects the problem of properly PAC learning decision trees to the parameterized Nearest Codeword Problem (k-NCP). It shows that any improvement on the quasipolynomial-time algorithm for decision trees would yield an O(log n)-approximation for k-NCP, an exponential improvement over the current O(n/log n) ratio. Combined with existing inapproximability results for k-NCP, the reduction also rules out polynomial-time algorithms for properly learning decision trees, even in the weak learning setting.

We connect the problem of properly PAC learning decision trees to the parameterized Nearest Codeword Problem ($k$-NCP). Despite significant effort by the respective communities, algorithmic progress on both problems has been stuck: the fastest known algorithm for the former runs in quasipolynomial time (Ehrenfeucht and Haussler 1989) and the best known approximation ratio for the latter is $O(n/\log n)$ (Berman and Karpinski 2002; Alon, Panigrahy, and Yekhanin 2009). Research on both problems has thus far proceeded independently with no known connections. We show that $\textit{any}$ improvement of Ehrenfeucht and Haussler's algorithm will yield $O(\log n)$-approximation algorithms for $k$-NCP, an exponential improvement of the current state of the art. This can be interpreted either as a new avenue for designing algorithms for $k$-NCP, or as one for establishing the optimality of Ehrenfeucht and Haussler's algorithm. Furthermore, our reduction along with existing inapproximability results for $k$-NCP already rule out polynomial-time algorithms for properly learning decision trees. A notable aspect of our hardness results is that they hold even in the setting of $\textit{weak}$ learning whereas prior ones were limited to the setting of strong learning.
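For readers unfamiliar with $k$-NCP, the following is a minimal brute-force sketch of the underlying question, assuming the usual formulation over $\mathbb{F}_2$: given a linear code (here, the span of some generator rows), a target vector, and a distance parameter $k$, is there a codeword within Hamming distance $k$ of the target? This is only an illustration of the problem statement, not the paper's reduction or any of the cited algorithms; the function names and the input encoding are hypothetical, and the search is exponential in the number of generator rows.

```python
from itertools import product


def nearest_codeword(generator_rows, target):
    """Exhaustively search the F_2 span of generator_rows.

    generator_rows: list of equal-length 0/1 tuples (assumed encoding).
    target: 0/1 tuple of the same length.
    Returns (distance, codeword) for a codeword closest to target.
    """
    n = len(target)
    best = None
    # Each 0/1 coefficient vector selects a subset of rows to XOR together.
    for coeffs in product((0, 1), repeat=len(generator_rows)):
        word = [0] * n
        for c, row in zip(coeffs, generator_rows):
            if c:
                word = [w ^ r for w, r in zip(word, row)]
        dist = sum(w != t for w, t in zip(word, target))
        if best is None or dist < best[0]:
            best = (dist, tuple(word))
    return best


def within_distance_k(generator_rows, target, k):
    """Decision version: is some codeword within Hamming distance k?"""
    return nearest_codeword(generator_rows, target)[0] <= k
```

An approximation algorithm with ratio $\alpha$ only has to distinguish the case where some codeword lies within distance $k$ from the case where every codeword is farther than $\alpha k$; the abstract's $O(n/\log n)$ and $O(\log n)$ bounds refer to that ratio.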
