Tore Selland Kleppe

3 papers

6 citations

Novelty: 55%

AI Score: 22

Ranked #186,220 of 201,326 authors (top 92%)

#67 in CO (top 97%)

3 Papers

ML Aug 28, 2020

agtboost: Adaptive and Automatic Gradient Tree Boosting Computations

Berent Ånund Strømnes Lunde, Tore Selland Kleppe

agtboost is an R package implementing fast gradient tree boosting computations in a manner similar to other established frameworks such as xgboost and LightGBM, but with significant decreases in computation time and required mathematical and technical knowledge. The package automatically takes care of split/no-split decisions and selects the number of trees in the gradient tree boosting ensemble, i.e., agtboost adapts the complexity of the ensemble automatically to the information in the data. All of this is done during a single training run, which is made possible by utilizing developments in information theory for tree algorithms (arXiv:2008.05926v1 [stat.ME]). agtboost also comes with a feature importance function that eliminates the common practice of inserting noise features. Further, a useful model validation function performs the Kolmogorov-Smirnov test on the learned distribution.
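The Kolmogorov-Smirnov validation idea mentioned above can be illustrated with a minimal sketch. It is written in Python with the standard library only and is not agtboost's own R implementation; the function names `ks_statistic_uniform` and `normal_cdf` are hypothetical. The assumption is that a learned predictive distribution is checked by transforming each observation through its predicted CDF: if the model is well calibrated, the transformed values should look uniform on (0, 1), so their KS statistic against the uniform distribution should be small.

```python
import math
import random


def ks_statistic_uniform(u):
    """One-sample Kolmogorov-Smirnov statistic against Uniform(0, 1)."""
    u = sorted(u)
    n = len(u)
    # Largest gap between the empirical CDF and the uniform CDF, checked on
    # both sides of each jump of the empirical CDF.
    d_plus = max((i + 1) / n - v for i, v in enumerate(u))
    d_minus = max(v - i / n for i, v in enumerate(u))
    return max(d_plus, d_minus)


def normal_cdf(y, mean, sd):
    """CDF of a normal distribution, standing in for a learned distribution."""
    return 0.5 * (1.0 + math.erf((y - mean) / (sd * math.sqrt(2.0))))


if __name__ == "__main__":
    rng = random.Random(1)
    ys = [rng.gauss(0.0, 1.0) for _ in range(500)]  # data truly N(0, 1)
    # Transform the data through a correct and a misspecified model.
    good = ks_statistic_uniform([normal_cdf(y, 0.0, 1.0) for y in ys])
    bad = ks_statistic_uniform([normal_cdf(y, 1.0, 1.0) for y in ys])
    print("KS statistic, correct model:     ", good)
    print("KS statistic, misspecified model:", bad)
```

The sketch computes only the test statistic; turning it into a p-value requires the Kolmogorov distribution, which is omitted here.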

ME Aug 13, 2020

An information criterion for automatic gradient tree boosting

Berent Ånund Strømnes Lunde, Tore Selland Kleppe, Hans Julius Skaug

An information-theoretic approach to learning the complexity of classification and regression trees and the number of trees in gradient tree boosting is proposed. The optimism (test loss minus training loss) of the greedy leaf splitting procedure is shown to be the maximum of a Cox-Ingersoll-Ross process, from which a generalization-error-based information criterion is formed. The proposed procedure allows fast local model selection without cross-validation-based hyperparameter tuning, and hence efficient and automatic comparison among the large number of models considered during each boosting iteration. Relative to xgboost, speedups in numerical experiments range from around 10 to about 1400, at similar predictive power measured in terms of test loss.
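The notion of optimism used above (test loss minus training loss) can be made concrete with a small simulation. The following Python sketch is an illustration under stated assumptions, not the paper's method: it does not use the Cox-Ingersoll-Ross result, and all names (`fit_root`, `fit_stump`, `average_optimism`) are hypothetical. The response is pure noise, so any split found by a greedy search fits noise only; the sketch compares the average optimism of a constant prediction with that of a single greedily chosen split.

```python
import random


def mse(ys, preds):
    return sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys)


def fit_root(ys):
    """Constant model: predict the training mean everywhere."""
    mean = sum(ys) / len(ys)
    return lambda x: mean


def fit_stump(xs, ys):
    """Greedy search over all split points for the lowest training error."""
    pairs = sorted(zip(xs, ys))
    n = len(pairs)
    total = sum(y for _, y in pairs)
    best = None  # (score, threshold, left_mean, right_mean)
    left_sum = 0.0
    for i in range(1, n):
        left_sum += pairs[i - 1][1]
        right_sum = total - left_sum
        # Maximizing this score is equivalent to minimizing squared error.
        score = left_sum ** 2 / i + right_sum ** 2 / (n - i)
        if best is None or score > best[0]:
            threshold = 0.5 * (pairs[i - 1][0] + pairs[i][0])
            best = (score, threshold, left_sum / i, right_sum / (n - i))
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean


def optimism(model, xs, ys, xt, yt):
    """Test loss minus training loss for a fitted model."""
    train = mse(ys, [model(x) for x in xs])
    test = mse(yt, [model(x) for x in xt])
    return test - train


def average_optimism(n_train=100, n_test=1000, reps=200, seed=0):
    rng = random.Random(seed)
    root_total = stump_total = 0.0
    for _ in range(reps):
        xs = [rng.random() for _ in range(n_train)]
        ys = [rng.gauss(0.0, 1.0) for _ in range(n_train)]  # no signal in x
        xt = [rng.random() for _ in range(n_test)]
        yt = [rng.gauss(0.0, 1.0) for _ in range(n_test)]
        root_total += optimism(fit_root(ys), xs, ys, xt, yt)
        stump_total += optimism(fit_stump(xs, ys), xs, ys, xt, yt)
    return root_total / reps, stump_total / reps


if __name__ == "__main__":
    root, stump = average_optimism()
    print("average optimism, constant model:", root)
    print("average optimism, greedy split:  ", stump)
```

Because the greedy search picks the best of many candidate splits, its training loss is biased downward more strongly than that of the constant model, which is the effect an optimism-based criterion has to account for.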

CO May 4, 2020

Connecting the Dots: Numerical Randomized Hamiltonian Monte Carlo with State-Dependent Event Rates

Tore Selland Kleppe

Numerical Generalized Randomized Hamiltonian Monte Carlo is introduced as a robust, easy-to-use and computationally fast alternative to conventional Markov chain Monte Carlo methods for continuous target distributions. First, a wide class of piecewise deterministic Markov processes generalizing Randomized HMC (Bou-Rabee and Sanz-Serna, 2017) by allowing for state-dependent event rates is defined. Under very mild restrictions, such processes will have the desired target distribution as an invariant distribution. Second, the numerical implementation of such processes, based on adaptive numerical integration of second-order ordinary differential equations (ODEs), is considered. The numerical implementation yields an approximate, yet highly robust algorithm that, unlike conventional Hamiltonian Monte Carlo, enables the exploitation of the complete Hamiltonian trajectories (hence the title). The proposed algorithm may yield large speedups and improvements in stability relative to relevant benchmarks, while incurring numerical biases that are negligible relative to the overall Monte Carlo errors. Granted access to a high-quality ODE code, the proposed methodology is both easy to implement and use, even for highly challenging and high-dimensional target distributions.
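A minimal Python sketch of the basic Randomized HMC process that the paper generalizes may help fix ideas. It rests on simplifying assumptions that differ from the paper: the event rate is constant rather than state-dependent, the target is a one-dimensional standard normal, and the Hamiltonian flow is available in closed form (a rotation), so no numerical ODE integration is involved. The names `exact_flow` and `randomized_hmc` are hypothetical. Between events the state follows the Hamiltonian flow; at exponentially distributed event times the momentum is redrawn; and the position is recorded on an equally spaced time grid along the continuous trajectory.

```python
import math
import random


def exact_flow(q, p, t):
    """Hamiltonian flow for H = q^2/2 + p^2/2 over time t (a rotation)."""
    c, s = math.cos(t), math.sin(t)
    return q * c + p * s, -q * s + p * c


def randomized_hmc(n_samples, event_rate=1.0, sample_spacing=0.5, seed=0):
    """Record positions on a time grid along a Randomized HMC trajectory."""
    rng = random.Random(seed)
    q, p = 0.0, rng.gauss(0.0, 1.0)
    time_to_event = rng.expovariate(event_rate)  # constant-rate events
    time_to_sample = sample_spacing
    samples = []
    while len(samples) < n_samples:
        # Move along the trajectory up to whichever comes first.
        step = min(time_to_event, time_to_sample)
        q, p = exact_flow(q, p, step)
        time_to_event -= step
        time_to_sample -= step
        if time_to_sample <= 0.0:
            samples.append(q)  # record the position at a grid time
            time_to_sample = sample_spacing
        if time_to_event <= 0.0:
            p = rng.gauss(0.0, 1.0)  # event: refresh the momentum
            time_to_event = rng.expovariate(event_rate)
    return samples


if __name__ == "__main__":
    draws = randomized_hmc(20000)
    mean = sum(draws) / len(draws)
    var = sum((d - mean) ** 2 for d in draws) / len(draws)
    print("sample mean:", mean)
    print("sample variance:", var)
```

For a general target the closed-form rotation would be replaced by a numerical ODE solver, which is where the approximation discussed in the abstract enters.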