IR LG ML · Aug 18, 2020

Learning the Structure of Auto-Encoding Recommenders

Farhan Khawar, Leonard Kin Man Poon, Nevin Lianwen Zhang

arXiv:2008.07956v1 · 21 citations

Originality: Incremental advance

AI Analysis

The paper addresses inefficient training and reduced performance of autoencoder recommenders on sparse data, and represents an incremental improvement over existing autoencoder methods.

The paper tackles the inefficiency and poor generalization of fully-connected autoencoder recommenders in sparse collaborative filtering data by introducing structure learning that groups related items to create a sparsely connected network. The result is a method that outperforms state-of-the-art models like Mult-VAE/Mult-DAE on multiple benchmarks, with better cold-start performance and improved convergence.

Autoencoder recommenders have recently shown state-of-the-art performance in the recommendation task due to their ability to model non-linear item relationships effectively. However, existing autoencoder recommenders use fully-connected neural network layers and do not employ structure learning. This can lead to inefficient training, especially when the data is sparse, as is commonly found in collaborative filtering, which in turn results in lower generalization ability and reduced performance. In this paper, we introduce structure learning for autoencoder recommenders by taking advantage of the inherent item groups present in the collaborative filtering domain. Due to the nature of items in general, we know that certain items are more related to each other than to other items. Based on this, we propose a method that first learns groups of related items and then uses this information to determine the connectivity structure of an auto-encoding neural network. This results in a network that is sparsely connected. This sparse structure can be viewed as a prior that guides the network training. Empirically, we demonstrate that the proposed structure learning enables the autoencoder to converge to a local optimum with a much smaller spectral norm and generalization error bound than the fully-connected network. The resultant sparse network considerably outperforms state-of-the-art methods like Mult-VAE/Mult-DAE on multiple benchmark datasets, even when the same number of parameters and FLOPs are used. It also has better cold-start performance.
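The idea of letting item groups determine the connectivity of the autoencoder can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not say how the groups are learned, so the groups below are hard-coded, and the names `build_group_mask`, `masked_forward`, and `n_hidden_per_group` are hypothetical. The sketch only shows how a binary mask derived from item groups yields a sparsely connected encoder and decoder, and how the spectral norm of a weight matrix can be measured.

```python
import numpy as np

def build_group_mask(item_groups, n_items, n_hidden_per_group=1):
    """Binary mask of shape (n_items, n_groups * n_hidden_per_group).

    Entry (i, j) is 1 only if item i belongs to the group that owns
    hidden unit j. Assumption: groups are given, not learned here.
    """
    n_groups = len(item_groups)
    mask = np.zeros((n_items, n_groups * n_hidden_per_group), dtype=np.float32)
    for g, items in enumerate(item_groups):
        cols = slice(g * n_hidden_per_group, (g + 1) * n_hidden_per_group)
        mask[items, cols] = 1.0
    return mask

def masked_forward(x, w_enc, w_dec, mask):
    """One-hidden-layer autoencoder whose weights are zeroed outside the mask."""
    hidden = np.tanh(x @ (w_enc * mask))    # (batch, n_hidden)
    logits = hidden @ (w_dec * mask.T)      # (batch, n_items)
    return hidden, logits

# Toy setup: 5 items in two hypothetical groups, 2 hidden units per group.
item_groups = [[0, 1, 2], [3, 4]]
n_items = 5
mask = build_group_mask(item_groups, n_items, n_hidden_per_group=2)

rng = np.random.default_rng(0)
w_enc = rng.normal(size=mask.shape).astype(np.float32)
w_dec = rng.normal(size=mask.T.shape).astype(np.float32)

x = np.array([[1, 0, 1, 0, 0],
              [0, 0, 0, 1, 1]], dtype=np.float32)  # two users' interactions
hidden, logits = masked_forward(x, w_enc, w_dec, mask)

# Spectral norm (largest singular value) of dense vs. masked encoder weights.
dense_norm = np.linalg.norm(w_enc, 2)
sparse_norm = np.linalg.norm(w_enc * mask, 2)
```

In this toy setup, the second user has interacted only with items in the second group, so the hidden units owned by the first group receive no input from that user. The comparison between `dense_norm` and `sparse_norm` here concerns random, untrained weights and says nothing about the trained networks the abstract describes.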

