IR LG · Jun 9, 2020

I know why you like this movie: Interpretable Efficient Multimodal Recommender

Barbara Rychalska, Dominika Basaj, Jacek Dąbrowski, Michał Daniluk

arXiv:2006.09979v13.02 citations

Originality: Incremental advance

AI Analysis

This paper adds interpretability to a state-of-the-art multimodal recommender system, addressing a bottleneck for users and developers who need to understand the model's decisions.

The paper tackles the problem of interpreting the Efficient Manifold Density Estimator (EMDE) model for multimodal movie recommendations, proving that white-box interpretation is possible and showing the influence of text, categorical features, and images on recommendations.

Recently, the Efficient Manifold Density Estimator (EMDE) model has been introduced. The model exploits the Locality-Sensitive Hashing and Count-Min Sketch algorithms, combining them with a neural network to achieve state-of-the-art results on multiple recommender datasets. However, this model ingests a compressed joint representation of all input items for each user/session, so calculating attributions for separate items via gradient-based methods does not seem applicable. We prove that interpreting this model in a white-box setting is possible thanks to the properties of the EMDE item retrieval method. By exploiting the multimodal flexibility of this model, we obtain meaningful results showing the influence of multiple modalities (text, categorical features, and images) on the movie recommendation output.
