LGAIMay 8

Approximation-Free Differentiable Oblique Decision Trees

arXiv:2605.0783719.8
Predicted impact top 81% in LG · last 90 daysOriginality Highly original
AI Analysis

For practitioners needing interpretable yet accurate models, DTSemNet provides a principled way to train oblique DTs without approximation errors, improving performance over existing differentiable DT methods.

DTSemNet introduces a semantically equivalent and invertible neural network representation of hard oblique decision trees, enabling approximation-free gradient-based training. It outperforms state-of-the-art differentiable DTs on classification and regression benchmarks and extends to reinforcement learning policies.

Decision Trees (DTs) are widely used in safety-critical domains such as medical diagnosis, valued for their interpretability and effectiveness on tabular data. However, training accurate oblique DTs is challenging due to complex optimization landscapes and overfitting risks, particularly in regression. Recent advances have introduced differentiable formulations that enable gradient-based training and joint optimization of decision boundaries and leaf regressors. Yet, existing approaches typically rely on approximations, either through probabilistic softening of boundaries (soft DTs) or quantized gradients such as the Straight-Through Estimator (STE). To overcome these limitations, we propose DTSemNet, a novel, semantically equivalent, and invertible representation of hard oblique DTs as neural networks. DTSemNet enables end-to-end training with standard gradient descent, eliminating the need for approximations in both classification and regression. While classification aligns naturally with this formulation, regression remains challenging due to the joint optimization of internal nodes and leaf regressors. To address this, we analyze the limitations of STE and introduce an annealed Top-k method that provides accurate gradient signals without approximation. Extensive experiments on classification and regression benchmarks show that DTSemNet-trained oblique DTs outperform state-of-the-art differentiable DTs. Furthermore, we demonstrate that DTSemNet can serve as programmatic DT policies in reinforcement learning environments, thereby broadening their applicability.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes