Causal Embeddings for Recommendation
This paper addresses the problem of aligning recommendation systems with business goals such as increasing sales or engagement, rather than merely predicting user preferences. It is an incremental advance in causal recommendation.
The paper tackles the gap between recommendation objectives that aim to modify user behavior and classical methods that evaluate candidates by their coherence with past behavior. It does so by optimizing a policy to increase the desired outcome relative to organic behavior. It proposes a domain adaptation algorithm that learns from biased logged data to predict outcomes under random exposure, and reports significant improvements over state-of-the-art factorization methods and other causal recommendation approaches.
Many current applications use recommendations to modify natural user behavior, for example to increase the number of sales or the time spent on a website. This results in a gap between the final recommendation objective and the classical setup, where recommendation candidates are evaluated by their coherence with past user behavior, by predicting either the missing entries in the user-item matrix or the most likely next event. To bridge this gap, we optimize a recommendation policy for the task of increasing the desired outcome versus the organic user behavior. We show this is equivalent to learning to predict recommendation outcomes under a fully random recommendation policy. To this end, we propose a new domain adaptation algorithm that learns from logged data containing outcomes from a biased recommendation policy and predicts recommendation outcomes according to random exposure. We compare our method against state-of-the-art factorization methods, in addition to new approaches of causal recommendation, and show significant improvements.
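To make the idea of predicting outcomes "under a fully random recommendation policy" from biased logs more concrete, here is a minimal sketch. It is not the domain adaptation algorithm proposed in the paper, whose details are not given in this summary. Instead it uses plain inverse propensity weighting, a standard off-policy technique, as a stand-in. All names (`LogRecord`, `naive_mean`, `ips_uniform_estimate`, `build_toy_log`) and all numbers are hypothetical, and the sketch assumes the logging policy's exposure probabilities are known.

```python
from collections import namedtuple

# One logged impression: which item was shown, how likely the biased
# logging policy was to show it, and whether the desired outcome occurred.
# (Hypothetical record layout, for illustration only.)
LogRecord = namedtuple("LogRecord", ["item", "propensity", "reward"])


def naive_mean(log):
    """Average outcome in the log; reflects the biased logging policy."""
    return sum(r.reward for r in log) / len(log)


def ips_uniform_estimate(log, n_items):
    """Estimate the average outcome under uniform random exposure.

    Each record is reweighted by (target probability / logging probability),
    where the target policy shows every item with probability 1 / n_items.
    """
    target_prob = 1.0 / n_items
    weighted = sum((target_prob / r.propensity) * r.reward for r in log)
    return weighted / len(log)


def build_toy_log():
    """Made-up log: item A is over-exposed (80%) but succeeds less often."""
    log = []
    # Item A: shown 800 times, 160 successes (20% success rate).
    log += [LogRecord("A", 0.8, 1)] * 160 + [LogRecord("A", 0.8, 0)] * 640
    # Item B: shown 200 times, 120 successes (60% success rate).
    log += [LogRecord("B", 0.2, 1)] * 120 + [LogRecord("B", 0.2, 0)] * 80
    return log


if __name__ == "__main__":
    log = build_toy_log()
    print("naive mean over biased log:", naive_mean(log))
    print("IPS estimate under random exposure:", ips_uniform_estimate(log, 2))
```

In this toy log the naive average is about 0.28 because the logging policy mostly shows the weaker item, while the reweighted estimate is about 0.40, which matches the outcome expected if both items were shown equally often. A learned model trained toward the reweighted target, rather than the raw log, is one simple way to approximate prediction under random exposure.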