SD LG AS Oct 27, 2021

Temporal Knowledge Distillation for On-device Audio Classification

Kwanghee Choi, Martin Kersner, Jacob Morton, Buru Chang

arXiv:2110.14131v2 16.232 citations

Originality: Incremental advance

AI Analysis

This work addresses audio classification on mobile devices. Its distillation approach is new, but the contribution is an incremental improvement over existing methods.

The paper aims to improve on-device audio classification models by proposing a knowledge distillation method that transfers temporal information from transformer-based models. The method applies to various student architectures, such as CNNs and RNNs, and improves predictive performance in experiments on audio event detection and noisy keyword spotting datasets.

Improving the performance of on-device audio classification models remains a challenge given the computational limits of the mobile environment. Many studies leverage knowledge distillation to boost predictive performance by transferring knowledge from large models to on-device models. However, most either lack a mechanism to distill the temporal information that is crucial to audio classification tasks, or require the large and on-device models to share a similar architecture. In this paper, we propose a new knowledge distillation method designed to incorporate the temporal knowledge embedded in the attention weights of large transformer-based models into on-device models. Our distillation method is applicable to various types of architectures, including non-attention-based architectures such as CNNs or RNNs, while retaining the original network architecture during inference. Through extensive experiments on both an audio event detection dataset and a noisy keyword spotting dataset, we show that our proposed method improves predictive performance across diverse on-device architectures.
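The abstract does not spell out the loss, so the following is only a minimal sketch of what distilling temporal knowledge from attention weights could look like, not the authors' implementation. It assumes the teacher's attention has already been reduced to a single distribution over time steps, that the student exposes one score per frame from a training-only auxiliary branch, and that the two are matched with a KL divergence after linear resampling. The names `resample`, `temporal_kd_loss`, `teacher_attn`, and `student_scores` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def resample(weights, length):
    """Linearly interpolate a 1-D sequence to `length` points, then renormalize.

    Assumption: teacher and student may use different temporal resolutions,
    so the teacher's weights are stretched or shrunk to the student's length.
    """
    n = len(weights)
    if length == 1 or n == 1:
        out = [sum(weights) / n] * length
    else:
        out = []
        for i in range(length):
            pos = i * (n - 1) / (length - 1)  # fractional index into `weights`
            lo = int(math.floor(pos))
            hi = min(lo + 1, n - 1)
            frac = pos - lo
            out.append((1 - frac) * weights[lo] + frac * weights[hi])
    total = sum(out)
    return [w / total for w in out]


def temporal_kd_loss(teacher_attn, student_scores, eps=1e-12):
    """KL(teacher || student) over time steps (hypothetical loss form).

    teacher_attn: non-negative weights over the teacher's time steps.
    student_scores: unnormalized per-frame scores from the student's
        auxiliary, training-only branch (an assumption of this sketch).
    """
    target = resample(teacher_attn, len(student_scores))
    pred = softmax(student_scores)
    return sum(t * math.log((t + eps) / (p + eps)) for t, p in zip(target, pred))


# Usage: a teacher that attends mostly to the last step, and a student
# that currently weights all four of its frames equally.
loss = temporal_kd_loss([0.1, 0.2, 0.7], [0.0, 0.0, 0.0, 0.0])
```

In a setup like this, the term would be added to the usual classification loss during training only. Dropping the auxiliary branch afterwards is one way to keep the on-device network unchanged at inference, which is the property the abstract describes.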
