Supervised Score-Based Modeling by Gradient Boosting
This work targets slow inference and degraded prediction performance in probabilistic supervised learning models, a concern for both researchers and practitioners, though it appears incremental because it builds on existing gradient boosting and score-based methods.
The paper tackles the limitations of existing score-based generative models in supervised learning by proposing a Supervised Score-based Model (SSM) that combines score matching with gradient boosting, resulting in improved accuracy and reduced inference time compared to other models.
Score-based generative models can effectively learn the distribution of data by estimating the gradient of its log-density (the score). Because these models denoise in multiple steps, researchers have recently considered combining them with gradient boosting, itself a multi-step supervised learning algorithm, to solve supervised learning tasks. However, existing approaches based on generative models are often limited by the models' stochastic nature and long inference time, both of which hurt prediction performance. Therefore, we propose a Supervised Score-based Model (SSM), which can be viewed as a gradient boosting algorithm combined with score matching. We provide a theoretical analysis of learning and sampling for SSM to balance inference time and prediction accuracy. Through an ablation experiment on selected examples, we demonstrate the strong performance of the proposed techniques. Additionally, we compare our model with other probabilistic models, including Natural Gradient Boosting (NGBoost), Classification and Regression Diffusion Models (CARD), and Diffusion Boosted Trees (DBT), as well as non-probabilistic gradient boosting machine (GBM) models. The experimental results show that our model outperforms existing models in both accuracy and inference time.
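To make the analogy between gradient boosting and score-based updates concrete, the following is a minimal illustrative sketch, not the SSM algorithm itself. It assumes a unit-variance Gaussian likelihood, under which the score with respect to the current prediction is simply the residual, and it fits that score with one-split regression stumps. All names (`gaussian_score`, `fit_stump`, `boost`, `mse`) and the toy data are hypothetical and chosen only for illustration.

```python
import random


def gaussian_score(y, f, sigma=1.0):
    """Gradient of log N(y; f, sigma^2) with respect to the prediction f."""
    return (y - f) / sigma ** 2


def fit_stump(xs, targets):
    """Fit a one-split regression stump to the targets by least squares."""
    best = None
    order = sorted(set(xs))
    for lo, hi in zip(order, order[1:]):
        thr = (lo + hi) / 2
        left = [t for x, t in zip(xs, targets) if x <= thr]
        right = [t for x, t in zip(xs, targets) if x > thr]
        lm = sum(left) / len(left)
        rm = sum(right) / len(right)
        sse = sum((t - lm) ** 2 for t in left) + sum((t - rm) ** 2 for t in right)
        if best is None or sse < best[0]:
            best = (sse, thr, lm, rm)
    _, thr, lm, rm = best
    return lambda x: lm if x <= thr else rm


def boost(xs, ys, n_steps=50, step_size=0.3):
    """Multi-step boosting: each weak learner approximates the current score."""
    learners = []
    preds = [0.0] * len(xs)
    for _ in range(n_steps):
        # Assumption: Gaussian likelihood, so the score equals the residual.
        scores = [gaussian_score(y, p) for y, p in zip(ys, preds)]
        h = fit_stump(xs, scores)
        learners.append(h)
        preds = [p + step_size * h(x) for p, x in zip(preds, xs)]
    # Inference cost grows linearly with the number of boosting steps.
    return lambda x: sum(step_size * h(x) for h in learners)


def mse(model, xs, ys):
    """Mean squared error of the model on the given points."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Toy one-dimensional regression data (hypothetical).
random.seed(0)
xs = [i / 20 for i in range(40)]
ys = [x ** 2 + random.gauss(0, 0.05) for x in xs]

model_short = boost(xs, ys, n_steps=5)
model_long = boost(xs, ys, n_steps=50)
print(mse(model_short, xs, ys), mse(model_long, xs, ys))
```

In this sketch, more boosting steps lower the training error but make each prediction proportionally more expensive, which is a simplified view of the inference-time versus accuracy trade-off mentioned above.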