Asymmetric Tri-training for Debiasing Missing-Not-At-Random Explicit Feedback
This work addresses a critical challenge in real-world recommender systems by providing a more reliable method for debiasing explicit feedback, though it appears incremental because it adapts an existing tri-training framework to a specific domain.
The paper tackles the problem of learning from biased explicit feedback in recommender systems, where ratings are missing-not-at-random. It proposes an asymmetric tri-training meta-learning method that minimizes a propensity-independent upper bound on the true performance metric. This makes the method robust to the high variance and the sensitivity to propensity model choice that affect previous propensity-based approaches.
In most real-world recommender systems, the observed rating data are subject to selection bias, and the data are thus missing-not-at-random. Developing a method to facilitate the learning of a recommender with biased feedback is one of the most challenging problems, as it is widely known that naive approaches under selection bias often lead to suboptimal results. A well-established solution to this problem is to use propensity scoring techniques. The propensity score is the probability of each data point being observed, and unbiased performance estimation is possible by weighting each data point by the inverse of its propensity. However, the performance of the propensity-based unbiased estimation approach is often affected by the choice of the propensity estimation model or by the high variance problem. To overcome these limitations, we propose a model-agnostic meta-learning method inspired by the asymmetric tri-training framework for unsupervised domain adaptation. The proposed method utilizes two predictors to generate data with reliable pseudo-ratings and another predictor to make the final predictions. In a theoretical analysis, a propensity-independent upper bound of the true performance metric is derived, and it is demonstrated that the proposed method can minimize this bound. We conduct comprehensive experiments using public real-world datasets. The results suggest that the previous propensity-based methods are largely affected by the choice of propensity models and the variance problem caused by the inverse propensity weighting. Moreover, we show that the proposed meta-learning method is robust to these issues and can facilitate the development of effective recommenders from biased explicit feedback.
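To make the two ideas in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It contrasts a naive error estimate over observed ratings with an inverse-propensity-weighted estimate, then shows one plausible way to pick "reliable" pseudo-ratings from two predictors. All function names, the squared-error loss, and the agreement threshold `eps` are assumptions introduced for illustration. In particular, the rule "keep an unobserved pair when the two predictors agree within `eps`" is borrowed from how asymmetric tri-training is commonly described for domain adaptation, and the abstract does not confirm it.

```python
import numpy as np


def naive_mse(ratings, preds, observed):
    """Mean squared error over observed entries only.

    Under missing-not-at-random data this is generally a biased
    estimate of the error over all user-item pairs.
    """
    mask = observed.astype(bool)
    return float(np.mean((ratings[mask] - preds[mask]) ** 2))


def ips_mse(ratings, preds, observed, propensity):
    """Inverse-propensity-scored estimate of the MSE over ALL pairs.

    Each observed squared error is weighted by 1 / propensity, where
    propensity is the (assumed known) probability of observing that pair.
    Small propensities produce large weights, which is the source of the
    high-variance problem the abstract mentions.
    """
    sq_err = (ratings - preds) ** 2
    return float(np.sum(observed * sq_err / propensity) / ratings.size)


def select_pseudo_ratings(pred_a, pred_b, observed, eps=0.1):
    """Hypothetical selection rule for pseudo-ratings.

    Keeps unobserved pairs where two predictors agree within eps and
    uses their average as the pseudo-rating. A third predictor would
    then be trained on the selected pairs (not shown here).
    """
    agree = (np.abs(pred_a - pred_b) <= eps) & (observed == 0)
    pseudo = 0.5 * (pred_a + pred_b)
    return agree, pseudo


if __name__ == "__main__":
    # Toy 2x2 rating matrix; only the first user's ratings are observed.
    ratings = np.array([[5.0, 1.0], [4.0, 2.0]])
    preds = np.full((2, 2), 3.0)
    observed = np.array([[1, 1], [0, 0]])
    propensity = np.array([[1.0, 0.5], [0.5, 0.25]])

    print("naive:", naive_mse(ratings, preds, observed))
    print("IPS:  ", ips_mse(ratings, preds, observed, propensity))
```

The IPS estimate depends directly on the `propensity` array, so a misspecified propensity model changes the estimate. The pseudo-rating step uses no propensities at all, which is the intuition behind a propensity-independent training signal.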