Vasily Ershov

31.6 · LG · Oct 24, 2018 · Code

CatBoost: gradient boosting with categorical features support

Anna Veronika Dorogush, Vasily Ershov, Andrey Gulin

In this paper we present CatBoost, a new open-source gradient boosting library that successfully handles categorical features and outperforms existing publicly available implementations of gradient boosting in terms of quality on a set of popular publicly available datasets. The library has a GPU implementation of the learning algorithm and a CPU implementation of the scoring algorithm, both of which are significantly faster than other gradient boosting libraries on ensembles of similar sizes.

0.8 · LG · Oct 24, 2018

Why every GBDT speed benchmark is wrong

Anna Veronika Dorogush, Vasily Ershov, Dmitriy Kruchinin

This article provides a comprehensive study of different ways to benchmark the speed of gradient boosted decision trees (GBDT) algorithms. We show the main problems with several straightforward ways to make benchmarks, explain why speed benchmarking is a challenging task, and provide a set of reasonable requirements for a benchmark to be fair and useful.

2 Papers