LGMLSep 30, 2019

Oblique Decision Trees from Derivatives of ReLU Networks

arXiv:1909.13488v226 citations
AI Analysis

This work addresses the challenge of training oblique decision trees more efficiently for applications like molecular property prediction, though it is incremental as it builds on existing ReLU network concepts.

The paper tackles the problem of realizing decision trees using neural networks by proposing locally constant networks, which are equivalent to oblique decision trees and allow parameter sharing, requiring only M neurons for 2^M leaf nodes. The result shows that this method outperforms alternative techniques in molecular property classification and regression tasks.

We show how neural models can be used to realize piece-wise constant functions such as decision trees. The proposed architecture, which we call locally constant networks, builds on ReLU networks that are piece-wise linear and hence their associated gradients with respect to the inputs are locally constant. We formally establish the equivalence between the classes of locally constant networks and decision trees. Moreover, we highlight several advantageous properties of locally constant networks, including how they realize decision trees with parameter sharing across branching / leaves. Indeed, only $M$ neurons suffice to implicitly model an oblique decision tree with $2^M$ leaf nodes. The neural representation also enables us to adopt many tools developed for deep networks (e.g., DropConnect (Wan et al., 2013)) while implicitly training decision trees. We demonstrate that our method outperforms alternative techniques for training oblique decision trees in the context of molecular property classification and regression tasks.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes