Causal Feature Selection Method for Contextual Multi-Armed Bandits in Recommender System
This work addresses feature selection for heterogeneous treatment effects in recommender systems, offering an incremental improvement over traditional correlation-based methods.
The paper tackles the problem that suboptimal feature selection degrades the performance of contextual multi-armed bandits (CMABs) in recommender systems. It introduces two model-free filter methods, HIE and HDD, which improved CMAB performance on synthetic data and in a large-scale commercial recommender system.
Effective feature selection is essential for optimizing contextual multi-armed bandits (CMABs) in large-scale online systems, where suboptimal features can degrade rewards, interpretability, and efficiency. Traditional feature selection often prioritizes outcome correlation, neglecting the crucial role of heterogeneous treatment effects (HTE) across arms in CMAB decision-making. This paper introduces two novel, model-free filter methods, Heterogeneous Incremental Effect (HIE) and Heterogeneous Distribution Divergence (HDD), specifically designed to identify features driving HTE. HIE quantifies a feature's value based on its ability to induce changes in the optimal arm, while HDD measures its impact on reward distribution divergence across arms. These methods are computationally efficient, robust to model mis-specification, and adaptable to various feature types, making them suitable for rapid screening in dynamic environments where retraining complex models is infeasible. We validate HIE and HDD on synthetic data with known ground truth and in a large-scale commercial recommender system, demonstrating their consistent ability to identify influential HTE features and thereby enhance CMAB performance.
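The abstract describes HIE and HDD only in words, so the following is an illustrative sketch of the two ideas and not the paper's actual definitions. It assumes a discrete feature, logged data in which arms were assigned independently of the feature, discrete rewards, and every arm observed for every feature value. The function names `hie_like_score`, `hdd_like_score`, and `tv_distance` are hypothetical, and the use of total variation distance as the divergence (restricted to two arms) is an assumption made for brevity.

```python
import numpy as np

def hie_like_score(feature, arm, reward):
    """Assumed HIE-style score: reward gained by choosing the best arm per
    feature value instead of a single globally best arm."""
    feature, arm, reward = (np.asarray(a) for a in (feature, arm, reward))
    arms = np.unique(arm)
    # Baseline: mean reward of the single best arm over all contexts.
    global_best = max(reward[arm == a].mean() for a in arms)
    per_value = 0.0
    for v in np.unique(feature):
        in_bin = feature == v
        # Best per-arm mean reward among contexts sharing this feature value.
        best_here = max(reward[in_bin & (arm == a)].mean() for a in arms)
        per_value += in_bin.mean() * best_here
    return per_value - global_best

def tv_distance(r_a, r_b, support):
    """Total variation distance between two empirical discrete distributions."""
    p = np.array([(r_a == s).mean() for s in support])
    q = np.array([(r_b == s).mean() for s in support])
    return 0.5 * np.abs(p - q).sum()

def hdd_like_score(feature, arm, reward):
    """Assumed HDD-style score (two arms): how much conditioning on the feature
    increases the divergence between the arms' reward distributions."""
    feature, arm, reward = (np.asarray(a) for a in (feature, arm, reward))
    a0, a1 = np.unique(arm)[:2]
    support = np.unique(reward)
    pooled = tv_distance(reward[arm == a0], reward[arm == a1], support)
    within = 0.0
    for v in np.unique(feature):
        in_bin = feature == v
        within += in_bin.mean() * tv_distance(
            reward[in_bin & (arm == a0)], reward[in_bin & (arm == a1)], support
        )
    return within - pooled

# Toy data: feature x flips which arm is optimal; feature z is irrelevant.
x, z, arm = (g.ravel() for g in np.meshgrid([0, 1], [0, 1], [0, 1], indexing="ij"))
x, z, arm = (np.tile(a, 10) for a in (x, z, arm))
reward = (arm == x).astype(float)  # an arm pays off only when it matches x

hie_x, hie_z = hie_like_score(x, arm, reward), hie_like_score(z, arm, reward)
hdd_x, hdd_z = hdd_like_score(x, arm, reward), hdd_like_score(z, arm, reward)
```

On this toy data both arms have the same average reward, so a purely correlation-based filter would see little in `x`. Both sketch scores nevertheless rank `x` above `z`, because `x` determines which arm is optimal.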