AS SD Nov 17, 2020

Ultra-Lightweight Speech Separation via Group Communication

arXiv:2011.08397v3 13.833 citations

Originality: Incremental advance

AI Analysis

This paper addresses the problem of model size for speech enhancement on low-resource devices such as hearing aids. It represents an incremental improvement in lightweight model design.

The paper tackles the challenge of deploying speech separation models on low-resource devices by introducing an ultra-lightweight design paradigm called GroupComm, which achieves performance comparable to a strong baseline with 35.6 times fewer parameters and 2.3 times fewer operations.

Model size and complexity remain the biggest challenges in deploying speech enhancement and separation systems on low-resource devices such as earphones and hearing aids. Although methods such as compression, distillation and quantization can be applied to large models, they often come at a cost to model performance. In this paper, we provide a simple model design paradigm that explicitly designs ultra-lightweight models without sacrificing performance. Motivated by sub-band frequency-LSTM (F-LSTM) architectures, we introduce group communication (GroupComm), in which a feature vector is split into smaller groups and a small processing block performs inter-group communication. Unlike standard F-LSTM models, where the sub-band outputs are concatenated, an ultra-small module is applied to all the groups in parallel, which allows a significant decrease in model size. Experimental results show that, compared with a strong baseline model that is already lightweight, GroupComm can achieve on-par performance with 35.6 times fewer parameters and 2.3 times fewer operations.
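The grouping idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the cross-group mean used for communication, the single shared dense layer, the function names, and the sizes (128 features, 16 groups) are all assumptions chosen for illustration. The sketch only shows why reusing one small module across groups needs far fewer parameters than one layer over the full feature vector.

```python
import random


def split_into_groups(vector, num_groups):
    """Split a flat feature vector into equal-sized contiguous groups."""
    if len(vector) % num_groups != 0:
        raise ValueError("feature size must be divisible by num_groups")
    size = len(vector) // num_groups
    return [vector[i * size:(i + 1) * size] for i in range(num_groups)]


def group_communication(groups):
    """Toy inter-group communication: add the cross-group mean to each group.

    Assumption: a hypothetical stand-in for the small learned
    communication block described in the abstract.
    """
    num_groups = len(groups)
    size = len(groups[0])
    mean = [sum(g[j] for g in groups) / num_groups for j in range(size)]
    return [[g[j] + mean[j] for j in range(size)] for g in groups]


def shared_linear(group, weight, bias):
    """Apply one small dense layer; the same weights serve every group."""
    return [sum(w * x for w, x in zip(row, group)) + b
            for row, b in zip(weight, bias)]


def dense_param_count(n_in, n_out):
    """Weights plus biases of a dense layer."""
    return n_in * n_out + n_out


def groupcomm_layer(vector, num_groups, weight, bias):
    """Split, let the groups communicate, then apply the shared module."""
    groups = split_into_groups(vector, num_groups)
    groups = group_communication(groups)
    return [shared_linear(g, weight, bias) for g in groups]


# Illustrative sizes only (assumptions, not taken from the paper).
feature_size, num_groups = 128, 16
group_size = feature_size // num_groups

rng = random.Random(0)
weight = [[rng.uniform(-0.1, 0.1) for _ in range(group_size)]
          for _ in range(group_size)]
bias = [0.0] * group_size
x = [rng.uniform(-1.0, 1.0) for _ in range(feature_size)]

out = groupcomm_layer(x, num_groups, weight, bias)

full_params = dense_param_count(feature_size, feature_size)  # one layer over all features
shared_params = dense_param_count(group_size, group_size)    # one small layer shared by all groups
print(full_params, shared_params)
```

The parameter counts here come from the toy layer sizes and are unrelated to the 35.6 times reduction reported in the abstract, which refers to complete separation models.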
