On the effectiveness of convolutional autoencoders on image-based personalized recommender systems
This work addresses a gap in image-based personalized recommendation for gastronomic platforms, but the contribution is incremental: it adapts existing autoencoder methods to a new domain.
The paper addresses the absence of personalized recommender systems on gastronomic platforms such as TripAdvisor by using user-tagged images to model tastes. It proposes a convolutional autoencoder as a feature extractor and reports that this is effective compared with the standard deep features computed by convolutional neural networks, on data from three cities.
Recommender systems (RS) are increasingly present in our daily lives, especially since the advent of Big Data, which allows for storing all kinds of information about users' preferences. Personalized RS are successfully applied in platforms such as Netflix, Amazon or YouTube. However, they are missing in gastronomic platforms such as TripAdvisor, even though these platforms host millions of images tagged with users' tastes. This paper explores the potential of using those images as a source of information for modeling users' tastes and proposes an image-based classification system to obtain personalized recommendations, using a convolutional autoencoder as a feature extractor. The proposed architecture is applied to TripAdvisor data, using users' reviews, each of which can be defined as a triad composed of a user, a restaurant, and an image of the restaurant taken by the user. Since the dataset is highly unbalanced, the use of data augmentation on the minority class is also considered in the experimentation. Results on data from three cities of different sizes (Santiago de Compostela, Barcelona and New York) demonstrate the effectiveness of using a convolutional autoencoder as a feature extractor, instead of the standard deep features computed with convolutional neural networks.
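
To make the data setup concrete, the following is a minimal sketch of the review triad and of augmenting only the minority class. It is not the paper's implementation. The names `Review`, `horizontal_flip` and `augment_minority` are hypothetical, as are the binary `label` field, the choice of a horizontal flip as the augmentation, and the policy of augmenting until both classes have the same size. The abstract specifies none of these details.

```python
import random
from dataclasses import dataclass
from typing import List

# Illustrative image type: a grayscale image as rows of pixel intensities.
Image = List[List[float]]


@dataclass
class Review:
    """One review triad (user, restaurant, image) plus a class label.

    The label is an assumption made for this sketch; the abstract only
    says the task is a classification problem with unbalanced classes.
    """
    user_id: str
    restaurant_id: str
    image: Image
    label: int


def horizontal_flip(image: Image) -> Image:
    """Return a mirrored copy of the image (assumed augmentation)."""
    return [list(reversed(row)) for row in image]


def augment_minority(reviews: List[Review], minority_label: int,
                     rng: random.Random) -> List[Review]:
    """Return a new list in which the minority class is topped up with
    flipped copies until it matches the size of the other class.

    Balancing to equal size is an assumption of this sketch. The input
    list is left unmodified.
    """
    minority = [r for r in reviews if r.label == minority_label]
    majority_count = len(reviews) - len(minority)
    augmented = list(reviews)
    if not minority:
        return augmented  # nothing to augment from
    for _ in range(max(majority_count - len(minority), 0)):
        source = rng.choice(minority)
        augmented.append(Review(source.user_id, source.restaurant_id,
                                horizontal_flip(source.image), source.label))
    return augmented
```

Calling `augment_minority(reviews, minority_label=1, rng=random.Random(0))` returns a balanced copy of the dataset. Because a single flip yields only one distinct variant per image, a real pipeline would likely combine several transformations; that choice is left open here.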