LG · Sep 29, 2021

Efficient Reinforced Feature Selection via Early Stopping Traverse Strategy

arXiv:2109.14180v2 · 25 citations
Originality: Incremental advance
AI Analysis

This work addresses the computational bottleneck in reinforced feature selection for data preprocessing, making the approach more practical for real-world applications, though it is an incremental improvement over existing methods.

The paper tackles the computational inefficiency of multi-agent reinforced feature selection methods by proposing a single-agent Monte Carlo approach with early stopping and reward-level interactive strategies, achieving significant speed improvements (up to 10x faster) while maintaining competitive accuracy on real-world datasets.

In this paper, we propose a single-agent Monte Carlo based reinforced feature selection (MCRFS) method, as well as two efficiency improvement strategies: an early stopping (ES) strategy and a reward-level interactive (RI) strategy. Feature selection is one of the most important techniques in data preprocessing, aiming to find the optimal feature subset for a given downstream machine learning task. Extensive research has been done to improve its effectiveness and efficiency. Recently, multi-agent reinforced feature selection (MARFS) has achieved great success in improving the performance of feature selection. However, MARFS suffers from a heavy computational cost, which greatly limits its application in real-world scenarios. We propose an efficient reinforced feature selection method that uses a single agent to traverse the whole feature set, deciding for each feature in turn whether to select it. Specifically, we first develop a behavior policy and use it to traverse the feature set and generate training data. We then evaluate the target policy on the training data and improve it using the Bellman equation. In addition, we conduct importance sampling incrementally and propose an early stopping strategy that improves training efficiency by removing skewed data: the behavior policy stops traversing with a probability inversely proportional to the importance sampling weight. We also propose a reward-level interactive strategy that improves training efficiency via reward-level external advice. Finally, we design extensive experiments on real-world data to demonstrate the superiority of the proposed method.
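
The traversal with incremental importance sampling and early stopping described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `traverse_with_early_stopping`, the `stop_scale` constant, and the simplification that each policy is a fixed per-feature selection probability (rather than a state-dependent policy) are all assumptions made for illustration. The abstract states only that the stopping probability is inversely proportional to the importance sampling weight, so the exact form `stop_scale / weight`, clipped to 1, is a guess at one reasonable instance.

```python
import random


def traverse_with_early_stopping(target_probs, behavior_probs,
                                 stop_scale=0.05, rng=None):
    """Traverse the feature set once with a single agent.

    target_probs[i] and behavior_probs[i] are the probabilities of
    selecting feature i under the target and behavior policies
    (hypothetical, state-independent simplification).

    Returns (actions, weight, stopped_early), where actions[i] is 1 if
    feature i was selected and 0 otherwise, and weight is the running
    importance sampling ratio of the visited prefix.
    """
    rng = rng or random.Random()
    actions = []
    weight = 1.0  # importance sampling weight, updated incrementally

    for p_target, p_behavior in zip(target_probs, behavior_probs):
        # The behavior policy decides whether to select this feature.
        select = rng.random() < p_behavior
        actions.append(1 if select else 0)

        # Incremental update: multiply by pi(a|s) / b(a|s) for this step.
        if select:
            weight *= p_target / p_behavior
        else:
            weight *= (1.0 - p_target) / (1.0 - p_behavior)

        # Assumed stopping rule: probability inversely proportional to
        # the weight, clipped to 1. A tiny weight marks a trajectory the
        # target policy would rarely produce (skewed data).
        stop_prob = 1.0 if weight == 0.0 else min(1.0, stop_scale / weight)
        if rng.random() < stop_prob:
            return actions, weight, True

    return actions, weight, False
```

Under these assumptions, a trajectory whose weight collapses toward zero is cut short almost immediately, so little computation is spent on samples that would contribute almost nothing to the evaluation of the target policy. Setting `stop_scale` to 0 disables early stopping whenever the weight stays positive.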

