CmnRec: Sequential Recommendations with Chunk-accelerated Memory Network
This work addresses the challenge of deploying complex memory networks in industrial-scale recommender systems by improving speed without sacrificing accuracy.
The paper tackles the computational inefficiency of Memory-based Neural Recommenders (MNR) in sequential recommendations by introducing a Chunk framework that reduces memory access operations, resulting in up to 7x faster training and 10x faster inference while maintaining competitive performance.
Recently, Memory-based Neural Recommenders (MNR) have demonstrated superior predictive accuracy in sequential recommendation, particularly in modeling long-term item dependencies. However, a typical MNR requires complex memory access operations, i.e., both writing and reading via a controller (e.g., an RNN) at every time step. These frequent operations dramatically increase training time, which makes MNR difficult to deploy in industrial-scale recommender systems. In this paper, we present a novel, general Chunk framework that accelerates MNR significantly. Specifically, our framework divides proximal information units into chunks and performs memory access only at certain time steps, which greatly reduces the number of memory operations. We investigate two ways to implement effective chunking, i.e., PEriodic Chunk (PEC) and Time-Sensitive Chunk (TSC), to preserve and recover important recurrent signals in the sequence. Since chunk-accelerated MNR models take into account more proximal information units than a single time step provides, they can remove much of the influence of noise in the item sequence and thus improve the stability of MNR. In this way, the proposed chunk mechanism leads not only to faster training and prediction, but even to slightly better results. Experimental results on three real-world datasets (weishi, ml-10M and ml-latest) show that our chunk framework notably reduces the running time of MNR (e.g., by up to 7x for training and 10x for inference on ml-latest) while achieving competitive performance.
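To make the source of the speedup concrete, the following is a minimal sketch of periodic chunking, not the paper's implementation. It assumes a fixed chunk size, assumes one memory access per chunk, and assumes a trailing partial chunk still triggers an access. The names `periodic_chunks` and `count_memory_accesses` are hypothetical and chosen for illustration only.

```python
from typing import List, Sequence


def periodic_chunks(items: Sequence[int], chunk_size: int) -> List[List[int]]:
    """Split an item sequence into consecutive chunks of a fixed size.

    Assumption: the last chunk may be shorter than chunk_size.
    """
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    return [list(items[i:i + chunk_size]) for i in range(0, len(items), chunk_size)]


def count_memory_accesses(seq_len: int, chunk_size: int) -> int:
    """Number of memory accesses if memory is touched once per chunk.

    With chunk_size == 1 this is the per-time-step baseline (seq_len accesses).
    """
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    return -(-seq_len // chunk_size)  # ceiling division


if __name__ == "__main__":
    sequence = list(range(10))  # ten hypothetical item IDs
    print(periodic_chunks(sequence, 4))
    print(count_memory_accesses(len(sequence), 1))  # baseline: every time step
    print(count_memory_accesses(len(sequence), 4))  # chunked
```

Under these assumptions, a sequence of length T with chunk size k needs about T/k memory accesses instead of T. The measured 7x and 10x speedups reported above also depend on the controller and the rest of the model, which this sketch does not capture.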