Exploring Feature-based Knowledge Distillation for Recommender System: A Frequency Perspective
This work shows that standard feature-based knowledge distillation for recommender systems weights all frequency components of the features equally, and improves it by reweighting those components.
The paper analyzes feature-based knowledge distillation in recommender systems from a frequency perspective and shows that equal loss weight allocation causes important knowledge to be overlooked. It proposes FreqD, a lightweight reweighting method that significantly outperforms state-of-the-art methods in the reported experiments.
In this paper, we analyze feature-based knowledge distillation for recommendation from a frequency perspective. By defining knowledge as the different frequency components of the features, we theoretically demonstrate that regular feature-based knowledge distillation is equivalent to minimizing the losses on all knowledge with equal weights, and we further analyze how this equal allocation of loss weights leads to important knowledge being overlooked. In light of this, we propose to emphasize important knowledge by redistributing the knowledge weights. Furthermore, we propose FreqD, a lightweight knowledge reweighting method that avoids the computational cost of calculating the loss on each knowledge component. Extensive experiments demonstrate that FreqD consistently and significantly outperforms state-of-the-art knowledge distillation methods for recommender systems. Our code is available at https://github.com/woriazzc/KDs.
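The equivalence claimed in the abstract can be illustrated with a small numerical sketch. It is not the FreqD implementation and is not taken from the linked repository. It assumes that "frequency components" means projections of the features onto the eigenvectors of a symmetric normalized graph Laplacian, which is one common choice and may differ from the paper's definition. The function names, the random toy graph, and the weights exp(-eigenvalue) are hypothetical and chosen only for illustration. The sketch computes the loss on every component explicitly, which is the cost the abstract says FreqD is designed to avoid.

```python
import numpy as np


def frequency_basis(adj):
    """Eigen-decompose the symmetric normalized Laplacian of `adj`.

    Returns eigenvalues in ascending order and orthonormal eigenvectors
    as columns. Using this basis is an assumption of the sketch.
    """
    deg = adj.sum(axis=1)
    # Isolated nodes (degree 0) get a zero scaling factor.
    d_inv_sqrt = np.where(deg > 0, 1.0 / np.sqrt(np.maximum(deg, 1e-12)), 0.0)
    lap = np.eye(adj.shape[0]) - d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]
    return np.linalg.eigh(lap)


def per_component_losses(student, teacher, eigvecs):
    """Squared student-teacher error on each frequency component."""
    diff_spectral = eigvecs.T @ (student - teacher)  # row k = component k
    return (diff_spectral ** 2).sum(axis=1)


def weighted_distill_loss(student, teacher, eigvecs, weights):
    """Weighted sum of the per-component losses."""
    return float(weights @ per_component_losses(student, teacher, eigvecs))


# Toy data: a random undirected graph and random teacher/student features.
rng = np.random.default_rng(0)
n, d = 8, 4
upper = np.triu(rng.integers(0, 2, size=(n, n)), k=1)
adj = (upper + upper.T).astype(float)
teacher = rng.normal(size=(n, d))
student = rng.normal(size=(n, d))

eigvals, eigvecs = frequency_basis(adj)

# Regular feature-based distillation: squared error over all features.
plain_loss = float(((student - teacher) ** 2).sum())

# The same loss written as an equally weighted sum over components.
equal_loss = weighted_distill_loss(student, teacher, eigvecs, np.ones(n))

# Hypothetical reweighting that emphasizes low-frequency components.
low_freq_weights = np.exp(-eigvals)
reweighted_loss = weighted_distill_loss(student, teacher, eigvecs, low_freq_weights)

print(np.isclose(plain_loss, equal_loss))
```

Because the eigenvectors form an orthonormal basis, plain_loss and equal_loss agree up to floating-point error, which is the sense in which the regular loss gives every component the same weight. Replacing the all-ones vector with any other non-negative weights redistributes the emphasis across components. Which components count as important, and how FreqD reweights them without the explicit decomposition, is specified in the paper rather than here.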