Relational Boosted Regression Trees
This work addresses an efficiency problem faced by machine learning practitioners who work with relational data. The contribution appears incremental: it adapts existing methods rather than introducing a fundamentally new paradigm.
The paper tackles the training of boosted regression trees on relational databases by giving a relational adaptation of the greedy algorithm. It obtains an asymptotically better runtime by using tensor sketching to compute a (1 + ε)-approximation of the sum of squared residuals.
Many tasks use data housed in relational databases to train boosted regression tree models. In this paper, we give a relational adaptation of the greedy algorithm for training boosted regression trees. For the subproblem of calculating the sum of squared residuals of the dataset, which dominates the runtime of the boosting algorithm, we provide a $(1 + \epsilon)$-approximation using the tensor sketch technique. Employing this approximation within the relational boosted regression trees algorithm yields similar model parameters, but with asymptotically better runtime.
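To make the tensor sketch idea concrete, the following is a minimal illustration of the generic technique, not the paper's relational algorithm. It sketches an outer product x ⊗ y into m buckets without materializing it, by combining two count sketches through an FFT-based circular convolution. The squared norm of the sketch then serves as an estimate of the squared norm of x ⊗ y, which is the kind of quantity a sum of squared residuals over a join reduces to. The function names, the use of fully random hashes, and the sketch size m are assumptions chosen for illustration.

```python
import numpy as np


def make_hashes(n, m, rng):
    """Draw a bucket hash into [0, m) and a +/-1 sign hash for n coordinates.

    Assumption: fully random hashes for simplicity; analyses of tensor
    sketch typically only require limited independence.
    """
    h = rng.integers(0, m, size=n)
    s = rng.choice([-1.0, 1.0], size=n)
    return h, s


def count_sketch(x, h, s, m):
    """Count sketch of x: add each signed coordinate into its bucket."""
    out = np.zeros(m)
    np.add.at(out, h, s * x)  # unbuffered add, so colliding buckets accumulate
    return out


def tensor_sketch(x, y, hx, sx, hy, sy, m):
    """Sketch of the outer product x ⊗ y without forming it.

    The circular convolution of the two count sketches equals the count
    sketch of x ⊗ y under the combined hash (hx[i] + hy[j]) mod m and the
    combined sign sx[i] * sy[j].
    """
    cx = count_sketch(x, hx, sx, m)
    cy = count_sketch(y, hy, sy, m)
    return np.real(np.fft.ifft(np.fft.fft(cx) * np.fft.fft(cy)))


def estimate_sq_norm(sketch):
    """Squared norm of the sketch, used as an estimate of ||x ⊗ y||^2."""
    return float(np.dot(sketch, sketch))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x, y = rng.normal(size=200), rng.normal(size=300)
    m = 4096  # hypothetical sketch size; larger m lowers the variance
    hx, sx = make_hashes(x.size, m, rng)
    hy, sy = make_hashes(y.size, m, rng)
    ts = tensor_sketch(x, y, hx, sx, hy, sy, m)
    exact = float(np.dot(x, x) * np.dot(y, y))  # ||x ⊗ y||^2 = ||x||^2 ||y||^2
    print("estimate:", estimate_sq_norm(ts), "exact:", exact)
```

Because the sketch is linear, the sketch of a sum of outer products is the sum of their sketches, which is what lets such sketches be aggregated piece by piece instead of over a fully materialized join.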