Adaptive Neural Trees
This work unifies representation learning and hierarchical decision-making, giving machine learning practitioners lightweight inference and an architecture that adapts to the data. Its contribution is incremental in that it bridges two existing paradigms rather than introducing a new one.
The paper tackles the problem of combining deep neural networks and decision trees by introducing adaptive neural trees (ANTs), which incorporate representation learning into tree components and adaptively grow the architecture, achieving competitive performance on classification and regression tasks.
Deep neural networks and decision trees operate on largely separate paradigms; typically, the former performs representation learning with pre-specified architectures, while the latter is characterised by learning hierarchies over pre-specified features with data-driven architectures. We unite the two via adaptive neural trees (ANTs), which incorporate representation learning into the edges, routing functions and leaf nodes of a decision tree, along with a backpropagation-based training algorithm that adaptively grows the architecture from primitive modules (e.g., convolutional layers). We demonstrate that, whilst achieving competitive performance on classification and regression datasets, ANTs benefit from (i) lightweight inference via conditional computation, (ii) hierarchical separation of features useful to the task, e.g. learning meaningful class associations such as separating natural vs. man-made objects, and (iii) a mechanism to adapt the architecture to the size and complexity of the training dataset.
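To make the structure described in the abstract concrete, the following is a minimal sketch of a two-leaf tree in which each node carries a learnable edge, internal nodes carry a routing function, and leaf nodes carry a predictor. It is an illustration under stated assumptions, not the authors' implementation: the names `Node`, `predict_multi_path` and `predict_single_path`, the hand-set weights, and the use of a sigmoid router with a softmax leaf are all hypothetical choices, and the growth procedure is omitted entirely.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def softmax(scores):
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def affine(weights, bias, x):
    # weights is a list of rows; returns weights @ x + bias
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


class Node:
    """Hypothetical tree node: an edge transform plus either a router
    (internal node) or a solver (leaf node)."""

    def __init__(self, edge, router=None, left=None, right=None, solver=None):
        self.edge = edge        # transforms incoming features
        self.router = router    # maps features to P(go left)
        self.left = left
        self.right = right
        self.solver = solver    # maps features to class probabilities

    @property
    def is_leaf(self):
        return self.solver is not None


def predict_multi_path(node, x):
    # Soft routing: mix the leaf predictions, weighted by path probability.
    h = node.edge(x)
    if node.is_leaf:
        return node.solver(h)
    p_left = node.router(h)
    left = predict_multi_path(node.left, h)
    right = predict_multi_path(node.right, h)
    return [p_left * l + (1.0 - p_left) * r for l, r in zip(left, right)]


def predict_single_path(node, x):
    # Conditional computation: follow only the more probable branch.
    h, visited = x, 0
    while True:
        h = node.edge(h)
        visited += 1
        if node.is_leaf:
            return node.solver(h), visited
        node = node.left if node.router(h) >= 0.5 else node.right


# Hand-set modules (assumed values, purely for illustration).
identity = lambda h: h
relu_edge = lambda h: [max(0.0, v) for v in h]

left_leaf = Node(
    edge=relu_edge,
    solver=lambda h: softmax(affine([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0], h)),
)
right_leaf = Node(
    edge=relu_edge,
    solver=lambda h: softmax(affine([[0.0, 1.0], [1.0, 0.0]], [0.0, 0.0], h)),
)
root = Node(
    edge=identity,
    router=lambda h: sigmoid(h[0] - h[1]),
    left=left_leaf,
    right=right_leaf,
)

x = [2.0, 0.0]
mixture = predict_multi_path(root, x)
greedy, nodes_visited = predict_single_path(root, x)
print(mixture, greedy, nodes_visited)
```

In this sketch the multi-path prediction evaluates all three nodes, whereas the single-path prediction evaluates only the nodes along one root-to-leaf path. This is one way to read the "lightweight inference via conditional computation" claim. In a trained model the edges, routers and solvers would be learned by backpropagation rather than set by hand.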