To remove or not remove Mobile Apps? A data-driven predictive model approach
This work addresses the issue of late app removal for developers and users in mobile app stores, offering an incremental improvement through predictive modeling.
The paper tackles the problem of predicting whether mobile apps will be removed from app stores by proposing a data-driven predictive model using XGBoost classifiers on a dataset of 870,515 apps, achieving AUCs of 0.792 for a user-centered model and 0.762 for a developer-centered model.
Mobile app stores are the key distributors of mobile applications, and they regularly vet the apps they host. Yet some of these vetting processes may be inadequate or applied late, and the late removal of an application can have unpleasant consequences for developers and users alike. In this work we therefore propose a data-driven predictive approach that determines whether a given app will be removed or accepted. The approach also reports feature relevance, which helps stakeholders interpret its predictions. In turn, it can support developers in improving their apps and users in choosing apps that are less likely to be removed. We focus on the Google App store and compile a new data set of 870,515 applications, 56% of which have actually been removed from the market. Our approach uses bootstrap aggregating (bagging) of multiple XGBoost classifiers. We propose two models: a user-centered model using 47 features, and a developer-centered model using 37 features, restricted to those available before deployment. On the test set we achieve the following areas under the ROC curve (AUCs): user-centered = 0.792, developer-centered = 0.762.
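To make the two technical ingredients of the abstract concrete, the following is a minimal sketch of bootstrap aggregating and of the ROC AUC metric, written in plain Python so that it runs without dependencies. It is not the authors' implementation: `fit_stump` is a hypothetical one-feature threshold rule standing in for a single XGBoost classifier, and the names `bagged_fit`, `bagged_score`, and `roc_auc`, the number of bagged models, and the toy data are all assumptions made for illustration.

```python
import random
from typing import Callable, List, Sequence

Row = Sequence[float]
Model = Callable[[Row], float]  # maps a feature row to a removal score in [0, 1]


def fit_stump(X: List[Row], y: List[int]) -> Model:
    """Hypothetical stand-in for one XGBoost classifier: a one-feature threshold rule."""
    best = None  # (training errors, feature index, threshold, polarity)
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            for polarity in (1, 0):
                # polarity 1 predicts "removed" (1) when feature > t; polarity 0 is the reverse
                preds = [polarity if row[j] > t else 1 - polarity for row in X]
                errors = sum(p != label for p, label in zip(preds, y))
                if best is None or errors < best[0]:
                    best = (errors, j, t, polarity)
    _, j, t, polarity = best
    return lambda row: float(polarity if row[j] > t else 1 - polarity)


def bagged_fit(X: List[Row], y: List[int], n_models: int = 15, seed: int = 0) -> List[Model]:
    """Bootstrap aggregating: train each base model on a resample drawn with replacement."""
    rng = random.Random(seed)
    n = len(X)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap sample of size n
        models.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return models


def bagged_score(models: List[Model], row: Row) -> float:
    """Average the base models' scores to get the ensemble's removal score."""
    return sum(m(row) for m in models) / len(models)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for s, t in zip(scores, y_true) if t == 1]
    neg = [s for s, t in zip(scores, y_true) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy data (invented): one feature, label 1 means "removed from the store".
    X = [[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]]
    y = [0, 0, 0, 1, 1, 1]
    ensemble = bagged_fit(X, y)
    scores = [bagged_score(ensemble, row) for row in X]
    print("ensemble scores:", scores)
    print("training AUC:", roc_auc(y, scores))
```

With the `xgboost` package installed, one plausible adaptation would replace `fit_stump` with a function that trains an `xgboost.XGBClassifier` on the bootstrap sample and returns its `predict_proba` output for the positive class. The hyperparameters and the number of bagged classifiers used in the paper are not stated in this abstract, so any concrete values would be guesses.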