ML LG Jul 18, 2022

pGMM Kernel Regression and Comparisons with Boosted Trees

arXiv:2207.08667v15.33 citations h-index: 13 Has Code

Originality: Incremental advance

AI Analysis

This work offers machine learning practitioners an incremental improvement: an additional tuning parameter $p$ in both kernel ridge regression (the pGMM kernel) and boosting ($L_p$ boost).

The paper evaluates the pGMM kernel in ridge regression and compares it with boosted trees. It finds that the pGMM kernel performs well even without tuning ($p=1$) and comparably to boosted trees once $p$ is tuned, and that $L_p$ boost with $p>2$ often yields the best $L_2$ loss.
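To make the role of $p$ in $L_p$ boost concrete, here is a minimal sketch of the per-sample derivatives a Newton-style boosting step would need. It assumes the loss takes the form $\sum_i |y_i - F_i|^p$ with $p \geq 2$, which the summary does not spell out; the function names (`lp_loss`, `lp_grad_hess`, `fit_constant`) are hypothetical and are not taken from the Fast ABC-Boost package. Instead of growing trees, the sketch fits a single constant, which is enough to show how the optimum shifts with $p$.

```python
def lp_loss(y, f, p):
    """Assumed L_p loss: sum of |y_i - f_i|^p over all samples."""
    return sum(abs(yi - fi) ** p for yi, fi in zip(y, f))


def lp_grad_hess(y, f, p):
    """First and second derivatives of |y - f|^p with respect to f (p >= 2)."""
    grads, hess = [], []
    for yi, fi in zip(y, f):
        r = yi - fi
        sign = (r > 0) - (r < 0)
        grads.append(-p * abs(r) ** (p - 1) * sign)
        hess.append(p * (p - 1) * abs(r) ** (p - 2))
    return grads, hess


def fit_constant(y, p, n_iter=25):
    """Newton iterations for the constant c minimizing sum |y_i - c|^p.

    This stands in for a single boosting leaf; a real booster would apply
    the same gradient/Hessian sums within each tree node.
    """
    c = sum(y) / len(y)  # start from the mean (the exact optimum for p = 2)
    for _ in range(n_iter):
        g, h = lp_grad_hess(y, [c] * len(y), p)
        total_h = sum(h)
        if total_h == 0:
            break
        c -= sum(g) / total_h
    return c


if __name__ == "__main__":
    y = [0.0, 1.0, 2.0, 9.0]
    for p in (2, 4, 8):
        c = fit_constant(y, p)
        print(p, c, lp_loss(y, [c] * len(y), p))
```

For $p=2$ the fitted constant is the sample mean. For larger $p$ the large residuals dominate the loss, so on this toy data the optimum moves from the mean toward the outlying value.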

In this work, we demonstrate the advantage of the pGMM (``powered generalized min-max'') kernel in the context of (ridge) regression. In recent prior studies, the pGMM kernel has been extensively evaluated for classification tasks with logistic regression, support vector machines, and deep neural networks. In this paper, we provide an experimental study on ridge regression, comparing pGMM kernel regression with ordinary linear ridge regression as well as RBF kernel ridge regression. Perhaps surprisingly, even without a tuning parameter (i.e., $p=1$ for the power parameter of the pGMM kernel), the pGMM kernel already performs well. Furthermore, by tuning the parameter $p$, this (deceptively simple) pGMM kernel even performs quite comparably to boosted trees. Boosting and boosted trees are very popular in machine learning practice. For regression tasks, practitioners typically use $L_2$ boost, i.e., boosting that minimizes the $L_2$ loss. Sometimes, for the purpose of robustness, $L_1$ boost might be a choice. In this study, we implement $L_p$ boost for $p\geq 1$ and include it in the ``Fast ABC-Boost'' package. Perhaps also surprisingly, the best performance (in terms of $L_2$ regression loss) is often attained at $p>2$, in some cases at $p\gg 2$. This phenomenon has already been demonstrated by Li et al. (UAI 2010) in the context of k-nearest neighbor classification using $L_p$ distances. In summary, the implementation of $L_p$ boost gives practitioners additional flexibility in tuning boosting algorithms, potentially achieving better accuracy in regression applications.
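The abstract does not state the kernel formula, so the following is only a sketch under an assumption: it uses the definition common in the prior pGMM literature, where each coordinate is split into its positive and negative parts and the kernel is $\sum_i \min(\tilde u_i, \tilde v_i)^p / \sum_i \max(\tilde u_i, \tilde v_i)^p$. The helper names (`transform`, `pgmm`, `fit_kernel_ridge`, `predict`), the regularization parameter `lam`, and the plain Gaussian-elimination solver are hypothetical choices for illustration, not the authors' implementation.

```python
def transform(x):
    """Map a D-dim vector to a nonnegative 2D-dim vector (positive/negative parts)."""
    out = []
    for v in x:
        out.append(v if v > 0 else 0.0)
        out.append(-v if v < 0 else 0.0)
    return out


def pgmm(u, v, p=1.0):
    """Assumed pGMM kernel; p = 1 corresponds to the untuned case."""
    tu, tv = transform(u), transform(v)
    num = sum(min(a, b) ** p for a, b in zip(tu, tv))
    den = sum(max(a, b) ** p for a, b in zip(tu, tv))
    return num / den if den > 0 else 0.0  # convention for two zero vectors (an assumption)


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(row) + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def fit_kernel_ridge(X, y, p=1.0, lam=1.0):
    """Kernel ridge regression: alpha = (K + lam * I)^{-1} y."""
    n = len(X)
    K = [[pgmm(X[i], X[j], p) + (lam if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    return solve(K, y)


def predict(X_train, alpha, x, p=1.0):
    """Prediction is a kernel-weighted sum over the training points."""
    return sum(a * pgmm(xi, x, p) for a, xi in zip(alpha, X_train))


if __name__ == "__main__":
    X = [[1.0, 0.0], [0.0, 1.0], [-1.0, 2.0]]
    y = [1.0, 2.0, 3.0]
    alpha = fit_kernel_ridge(X, y, p=2.0, lam=0.1)
    print(predict(X, alpha, [0.5, 0.5], p=2.0))
```

In practice one would select `p` and `lam` by cross-validation and use a numerical library for the linear solve; the pure-Python solver here only keeps the sketch self-contained.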
