REMEDI: Relative Feature Enhanced Meta-Learning with Distillation for Imbalanced Prediction
This work addresses a critical challenge in industry settings: businesses that need to target potential buyers from heavily imbalanced data. The contribution appears incremental, however, because it builds on existing meta-learning and distillation techniques.
The paper tackles the prediction of future vehicle purchases among existing owners under extreme class imbalance (<0.5% positive rate). It proposes REMEDI, a multi-stage framework that achieves the business target of identifying ~50% of actual buyers within the top 60,000 recommendations at ~10% precision.
Predicting future vehicle purchases among existing owners presents a critical challenge due to extreme class imbalance (<0.5% positive rate) and complex behavioral patterns. We propose REMEDI (Relative feature Enhanced Meta-learning with Distillation for Imbalanced prediction), a novel multi-stage framework addressing these challenges. REMEDI first trains diverse base models to capture complementary aspects of user behavior. Second, inspired by comparative optimization techniques, we introduce relative performance meta-features (deviation from ensemble mean, rank among peers) for effective model fusion through a hybrid-expert architecture. Third, we distill the ensemble's knowledge into a single efficient model via supervised fine-tuning with MSE loss, enabling practical deployment. Evaluated on approximately 800,000 vehicle owners, REMEDI significantly outperforms baseline approaches, achieving the business target of identifying ~50% of actual buyers within the top 60,000 recommendations at ~10% precision. The distilled model preserves the ensemble's predictive power while maintaining deployment efficiency, demonstrating REMEDI's effectiveness for imbalanced prediction in industry settings.
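The relative meta-features and the top-K business metric mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that "deviation from ensemble mean" and "rank among peers" are computed per sample across the base models' predicted scores, and the function names `relative_meta_features` and `precision_recall_at_k` are hypothetical.

```python
import numpy as np


def relative_meta_features(base_scores):
    """Build relative meta-features from base-model scores.

    base_scores: array of shape (n_samples, n_models), one column per
    base model (assumed layout, not taken from the paper).
    Returns an array of shape (n_samples, 3 * n_models) holding the raw
    scores, each score's deviation from the per-sample ensemble mean,
    and each score's rank among its peers (0 = lowest score).
    """
    base_scores = np.asarray(base_scores, dtype=float)
    ensemble_mean = base_scores.mean(axis=1, keepdims=True)
    deviation = base_scores - ensemble_mean
    # Double argsort turns scores into per-row ranks; ties are broken
    # by sort order rather than averaged.
    ranks = base_scores.argsort(axis=1).argsort(axis=1).astype(float)
    return np.hstack([base_scores, deviation, ranks])


def precision_recall_at_k(y_true, scores, k):
    """Precision and recall among the k highest-scored samples."""
    y_true = np.asarray(y_true)
    top_k = np.argsort(-np.asarray(scores, dtype=float))[:k]
    hits = y_true[top_k].sum()
    return hits / k, hits / y_true.sum()


if __name__ == "__main__":
    # Toy scores from three hypothetical base models for two owners.
    toy_scores = np.array([[0.2, 0.5, 0.8],
                           [0.9, 0.1, 0.5]])
    print(relative_meta_features(toy_scores))

    # Toy labels and fused scores; k=2 stands in for the top-60,000 cut.
    print(precision_recall_at_k([1, 0, 0, 1, 0], [0.9, 0.8, 0.1, 0.7, 0.2], k=2))
```

In a pipeline of this shape, the meta-feature matrix would be the input to the fusion model, and `precision_recall_at_k` with `k` set to the recommendation budget would measure the share of buyers captured within that budget.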