IR AI LG · Feb 20, 2022

Cross-Task Knowledge Distillation in Multi-Task Recommendation

Chenxiao Yang, Junwei Pan, Xiaofeng Gao, Tingyu Jiang, Dapeng Liu, Guihai Chen

arXiv:2202.09852v2 · 17.059 citations · h-index: 63

Originality: Incremental advance

AI Analysis

This work addresses the challenge of enhancing multi-task learning in recommender systems by enabling knowledge transfer between tasks. It is an incremental advance because it builds on existing multi-task learning (MTL) and knowledge distillation (KD) methods.

The paper tackles the problem of transferring task-specific knowledge between predictions of different user feedback types in multi-task recommendation systems. It proposes a Cross-Task Knowledge Distillation framework that uses the prediction results of one task as supervised signals for another. Experiments on real-world datasets show significant gains in metrics such as AUC and NDCG.

Multi-task learning (MTL) has been widely used in recommender systems, wherein the prediction of each type of user feedback on items (e.g., click, purchase) is treated as an individual task, and all tasks are jointly trained with a unified model. Our key observation is that the prediction results of each task may contain task-specific knowledge about users' fine-grained preferences towards items. While such knowledge could be transferred to benefit other tasks, it is overlooked under the current MTL paradigm. This paper instead proposes a Cross-Task Knowledge Distillation framework that leverages the prediction results of one task as supervised signals to teach another task. However, integrating MTL and knowledge distillation (KD) properly is non-trivial due to several challenges, including task conflicts, inconsistent magnitude, and the requirement of synchronous optimization. As countermeasures, we 1) introduce auxiliary tasks with quadruplet loss functions to capture cross-task fine-grained ranking information and avoid task conflicts, 2) design a calibrated distillation approach to align and distill knowledge from auxiliary tasks, and 3) propose a novel error correction mechanism to enable and facilitate synchronous training of teacher and student models. Comprehensive experiments are conducted to verify the effectiveness of our framework on real-world datasets.
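
To make the ideas in the abstract more concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It illustrates two generic building blocks that the abstract alludes to: a pairwise ranking loss applied over a four-level ordering of (click, purchase) label pairs, and a distillation loss in which a calibrated teacher prediction serves as a soft label for the student. The specific ordering of the four levels, the Platt-style calibration parameters `a` and `b`, the mixing weight `alpha`, and all function names are assumptions made for illustration; the abstract does not specify them, and the error correction mechanism is not covered here.

```python
import math
from itertools import combinations


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def bce(target: float, prob: float, eps: float = 1e-12) -> float:
    """Binary cross-entropy between a hard or soft target and a probability."""
    prob = min(max(prob, eps), 1.0 - eps)
    return -(target * math.log(prob) + (1.0 - target) * math.log(1.0 - prob))


def pairwise_rank_loss(score_preferred: float, score_other: float) -> float:
    """Logistic pairwise loss: small when the preferred item scores higher."""
    return -math.log(sigmoid(score_preferred - score_other))


# ASSUMPTION: one plausible preference ordering over (click, purchase) labels.
# The abstract only says "quadruplet loss"; this ordering is hypothetical.
LEVEL = {(1, 1): 3, (1, 0): 2, (0, 1): 1, (0, 0): 0}


def quadruplet_rank_loss(scores, labels) -> float:
    """Sum pairwise losses over all item pairs with different preference levels."""
    total = 0.0
    for i, j in combinations(range(len(scores)), 2):
        li, lj = LEVEL[labels[i]], LEVEL[labels[j]]
        if li > lj:
            total += pairwise_rank_loss(scores[i], scores[j])
        elif lj > li:
            total += pairwise_rank_loss(scores[j], scores[i])
    return total


def distillation_loss(hard_label: float, student_logit: float,
                      teacher_logit: float, alpha: float = 0.5,
                      a: float = 1.0, b: float = 0.0) -> float:
    """Mix the usual supervised loss with a loss against a calibrated teacher.

    ASSUMPTION: Platt-style scaling sigmoid(a * logit + b) stands in for the
    paper's calibration step; a, b, and alpha are hypothetical parameters.
    """
    student_prob = sigmoid(student_logit)
    soft_label = sigmoid(a * teacher_logit + b)
    return ((1.0 - alpha) * bce(hard_label, student_prob)
            + alpha * bce(soft_label, student_prob))


if __name__ == "__main__":
    scores = [2.0, 1.0, 0.5, -1.0]
    labels = [(1, 1), (1, 0), (0, 1), (0, 0)]
    print("ranking loss:", quadruplet_rank_loss(scores, labels))
    print("distillation loss:", distillation_loss(1.0, 0.3, 1.2))
```

In this sketch the calibration step matters because a ranking-trained teacher produces scores whose scale need not match a probability; mapping them through a sigmoid with fitted parameters is one common way to address the "inconsistent magnitude" issue the abstract mentions, though the paper's own approach may differ.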
