RLBoost: Boosting Supervised Models using Deep Reinforcement Learning
This work addresses data quality in supervised learning. Its contribution is domain-specific and incremental, building on existing reinforcement learning and attention mechanisms.
The paper tackles data quality evaluation for supervised learning by introducing RLBoost, an algorithm that uses deep reinforcement learning to assess dataset quality and filter out dubious data. The authors report better and more stable predictive performance than state-of-the-art methods such as LOO, DataShapley, and DVRL.
Evaluating data quality is sometimes as important as collecting a large volume of data when building accurate artificial intelligence models. In fact, being able to evaluate the data can lead to a larger database that is better suited to a particular problem, because automatically collected data of dubious quality can be filtered out. In this paper we present RLBoost, an algorithm that uses deep reinforcement learning strategies to evaluate a particular dataset and obtain a model capable of estimating the quality of any new data, in order to improve the final predictive quality of a supervised learning model. This solution has the advantage of being agnostic to the supervised model used and, through multi-attention strategies, it takes the data into account in its context and not only individually. The results of the article show that this model obtains better and more stable results than other state-of-the-art algorithms such as LOO, DataShapley or DVRL.
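To make the general idea concrete, the following is a minimal sketch of reinforcement-learning-based data selection, not the RLBoost implementation. It rests on several assumptions that do not come from the paper: a per-sample Bernoulli policy trained with a plain REINFORCE update stands in for the deep reinforcement learning agent, the multi-attention component is omitted entirely, a nearest-centroid classifier stands in for the (arbitrary) supervised model, and the reward is taken to be the change in validation accuracy relative to training on all data. All function names and the toy data are hypothetical.

```python
import numpy as np

def fit_centroids(X, y, n_classes):
    """Hypothetical stand-in supervised model: one mean vector per class."""
    centroids = np.zeros((n_classes, X.shape[1]))
    for c in range(n_classes):
        members = X[y == c]
        if len(members) > 0:
            centroids[c] = members.mean(axis=0)
    return centroids

def accuracy(centroids, X, y):
    """Fraction of points whose nearest centroid matches their label."""
    dists = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return float((dists.argmin(axis=1) == y).mean())

def learn_selection_probs(X_tr, y_tr, X_val, y_val,
                          n_classes=2, steps=200, lr=0.5, seed=0):
    """Learn one keep-probability per training sample (assumed simplification)."""
    rng = np.random.default_rng(seed)
    logits = np.zeros(len(X_tr))  # every sample starts with keep-probability 0.5
    # Baseline score: the model trained on all data, without any filtering.
    baseline = accuracy(fit_centroids(X_tr, y_tr, n_classes), X_val, y_val)
    for _ in range(steps):
        probs = 1.0 / (1.0 + np.exp(-logits))
        mask = rng.random(len(X_tr)) < probs  # sample a subset to train on
        if not mask.any():
            continue
        model = fit_centroids(X_tr[mask], y_tr[mask], n_classes)
        reward = accuracy(model, X_val, y_val) - baseline
        # REINFORCE: gradient of the Bernoulli log-likelihood w.r.t. the logits
        # is (mask - probs); scale it by the reward.
        logits += lr * reward * (mask.astype(float) - probs)
    return 1.0 / (1.0 + np.exp(-logits))

# Toy data: two Gaussian blobs, with some training labels flipped on purpose.
data_rng = np.random.default_rng(1)
X_tr = np.vstack([data_rng.normal(-2, 1, (40, 2)), data_rng.normal(2, 1, (40, 2))])
y_tr = np.array([0] * 40 + [1] * 40)
y_tr[:8] = 1  # simulated label noise
X_val = np.vstack([data_rng.normal(-2, 1, (20, 2)), data_rng.normal(2, 1, (20, 2))])
y_val = np.array([0] * 20 + [1] * 20)

probs = learn_selection_probs(X_tr, y_tr, X_val, y_val)
keep = probs >= 0.5  # one possible filtering rule; the threshold is an assumption
```

The learned `probs` play the role of per-sample quality estimates. In this sketch the only coupling to the supervised model is through the fit and score functions, which is one way to read the model-agnostic property described above. Unlike this sketch, which scores each sample independently, the approach in the paper is described as taking each sample's context into account.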