LG, May 23

Trajectory-Based Difficulty Scoring for Reliable Learning on Tabular Data

Tomer Lavi, Bracha Shapira, Nadav Rappoport

arXiv:2605.24680

AI Analysis

For practitioners using gradient-boosted trees on tabular data, TDS provides a reliable difficulty signal that enhances label efficiency, risk control, and error analysis.

The paper introduces a Trajectory-based Difficulty Score (TDS) for gradient-boosted trees that estimates instance-level difficulty from per-tree cumulative prediction trajectories. TDS shows strong rank correlation with error, outperforms established instance-hardness and uncertainty baselines on classification while remaining competitive on regression, and improves active learning, selective prediction, and conformal prediction workflows.

Gradient-boosted trees achieve strong performance on tabular data, yet often leave a long tail of poorly predicted instances. We introduce a Trajectory-based Difficulty Score (TDS), an instance-level difficulty estimator for boosted ensembles derived from per-tree cumulative prediction trajectories. For each instance, we compute interpretable trajectory descriptors (e.g., variance, oscillation peaks, sign switches, and tail stability) and train a lightweight regression model to predict held-out loss. An empirical CDF calibrates the resulting signal into a score in $[0,1]$ that supports ranking hard cases. Across diverse tabular benchmarks and ensemble sizes, TDS exhibits strong rank correlation with error and outperforms established instance-hardness and uncertainty baselines on classification, while remaining competitive on regression. We then show how a single difficulty signal improves multiple data mining workflows: difficulty-driven active learning for label-efficient training, difficulty-thresholded selective prediction for improved risk-coverage trade-offs, and TDS-stratified (Mondrian) conformal prediction for more uniform conditional coverage. Finally, clustering high-TDS instances using SHAP attributions reveals coherent failure modes characterized by compact feature-value ranges, supporting error analysis and targeted data acquisition.
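
The descriptor and calibration steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the exact descriptor definitions, the `tail` window, and the function names `trajectory_descriptors` and `ecdf_calibrate` are assumptions made for illustration, and the lightweight regression model that maps descriptors to held-out loss is omitted. The sketch assumes each trajectory is the ensemble's raw prediction (for example, the margin) after each successive tree.

```python
from bisect import bisect_right
from statistics import pstdev, pvariance


def trajectory_descriptors(trajectory, tail=3):
    """Summarise one instance's cumulative prediction trajectory.

    trajectory[t] is the ensemble's raw prediction after the first t+1 trees.
    The descriptor definitions below are illustrative assumptions.
    """
    # Per-tree increments of the cumulative prediction.
    steps = [b - a for a, b in zip(trajectory, trajectory[1:])]
    # Oscillation peak: two consecutive increments point in opposite directions.
    peaks = sum(1 for s, t in zip(steps, steps[1:]) if s * t < 0)
    # Sign switch: the cumulative prediction crosses zero between two trees.
    switches = sum(1 for a, b in zip(trajectory, trajectory[1:]) if a * b < 0)
    return {
        "variance": pvariance(trajectory),
        "oscillation_peaks": peaks,
        "sign_switches": switches,
        # Tail stability, measured here as the spread of the last `tail` values.
        "tail_std": pstdev(trajectory[-tail:]),
    }


def ecdf_calibrate(raw_score, reference_scores):
    """Map a raw difficulty value to [0, 1] using an empirical CDF.

    Returns the fraction of reference scores that are <= raw_score.
    """
    ordered = sorted(reference_scores)
    return bisect_right(ordered, raw_score) / len(ordered)


# Hypothetical trajectories: one that converges smoothly, one that oscillates.
easy = [0.5, 0.9, 1.2, 1.4, 1.5, 1.55]
hard = [0.4, -0.3, 0.5, -0.2, 0.3, -0.1]
d_easy = trajectory_descriptors(easy)
d_hard = trajectory_descriptors(hard)
```

In this toy example the oscillating trajectory yields more peaks, more sign switches, and a less stable tail than the smoothly converging one. In a full pipeline, the value passed to `ecdf_calibrate` would presumably be the regression model's predicted loss, with the reference scores taken from a held-out set.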
