IR, LG · Dec 14, 2020

StackRec: Efficient Training of Very Deep Sequential Recommender Models by Iterative Stacking

Jiachun Wang, Fajie Yuan, Jian Chen, Qingyao Wu, Min Yang, Yang Sun, Guoxiao Zhang

arXiv:2012.07598v2 10.427 citations Has Code

Originality: Incremental advance

AI Analysis

This work provides a more efficient training method for deep sequential recommender models, which is beneficial for researchers and practitioners dealing with large-scale recommendation systems.

This paper addresses the challenge of training very deep sequential recommender models, which can have up to 100 layers and may be trained on tens of billions of user-item interactions, leading to high computational costs. The authors propose StackRec, an iterative layer-stacking framework that transfers knowledge from shallower pre-trained models to deeper ones. The resulting models perform comparably to models trained from scratch while training substantially faster.

Deep learning has brought great progress to sequential recommendation (SR) tasks. With advanced network architectures, sequential recommender models can be stacked with many hidden layers, e.g., up to 100 layers on real-world recommendation datasets. Training such a deep network is difficult because it is computationally expensive and takes much longer, especially when there are tens of billions of user-item interactions. To address this challenge, we present StackRec, a simple yet effective and efficient training framework for deep SR models based on iterative layer stacking. Specifically, we first offer an important insight: hidden layers/blocks in a well-trained deep SR model have very similar distributions. Motivated by this, we propose a stacking operation on the pre-trained layers/blocks to transfer knowledge from a shallower model to a deeper one, and then perform stacking iteratively to yield a much deeper but easier-to-train SR model. We validate the performance of StackRec by instantiating it with four state-of-the-art SR models in three practical scenarios with real-world datasets. Extensive experiments show that StackRec achieves not only comparable performance but also substantial acceleration in training time, compared to SR models trained from scratch. Code is available at https://github.com/wangjiachun0426/StackRec.
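
The iterative stacking idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the names `stack_blocks`, `iterative_stacking`, and `dummy_train`, the dictionary representation of a block, the choice to double the depth at each step, and the order in which copied blocks are appended are all assumptions made for illustration only.

```python
import copy
from typing import Callable, Dict, List

# Hypothetical stand-in for the parameters of one hidden block.
Block = Dict[str, List[float]]


def stack_blocks(blocks: List[Block]) -> List[Block]:
    """Build a deeper model by appending copies of the trained blocks.

    Assumption: depth is doubled and the copies are placed on top of the
    originals. Other stacking orders are possible.
    """
    return blocks + [copy.deepcopy(b) for b in blocks]


def iterative_stacking(
    init_blocks: List[Block],
    train: Callable[[List[Block]], List[Block]],
    num_stacks: int,
) -> List[Block]:
    """Train a shallow model, then alternate stacking and further training."""
    blocks = train(init_blocks)
    for _ in range(num_stacks):
        # The deeper model starts from the shallower model's trained weights.
        blocks = train(stack_blocks(blocks))
    return blocks


def dummy_train(blocks: List[Block]) -> List[Block]:
    """Placeholder for a real training loop: shifts every weight by 0.1."""
    return [{"w": [x + 0.1 for x in b["w"]]} for b in blocks]


shallow = [{"w": [0.0, 0.0]}, {"w": [1.0, 1.0]}]
deep = iterative_stacking(shallow, dummy_train, num_stacks=2)
print(len(deep))  # 8 blocks: 2 -> 4 -> 8
```

In a real setting, `dummy_train` would be replaced by training the SR model on interaction data, and each stage would presumably need fewer updates than training the deep model from scratch, since it starts from already-trained blocks.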

