SD, LG, AS | Apr 6, 2021

Binary Neural Network for Speaker Verification

arXiv:2104.02306v1 | 12 citations
Originality: Incremental advance
AI Analysis

This work addresses the problem of computational and memory inefficiency for speaker verification on embedded devices, representing an incremental improvement through binarization of existing networks.

The paper tackles the challenge of deploying high-performance speaker verification systems on low-resource embedded devices by applying binary neural networks. The binarized ResNet34-based network achieves an EER of around 5% on VoxCeleb1 and outperforms real-valued networks on the Xiaole dataset, with a 32x memory saving.

Although deep neural networks are successful for many tasks in the speech domain, their high computational and memory costs make it difficult to directly deploy high-performance neural network systems on low-resource embedded devices. There are several mechanisms to reduce the size of neural networks, e.g. parameter pruning and parameter quantization. This paper focuses on how to apply binary neural networks to the task of speaker verification. The proposed binarization of training parameters can largely maintain performance while significantly reducing storage requirements and computational costs. Experimental results show that, after binarizing the convolutional neural network, the ResNet34-based network achieves an EER of around 5% on the VoxCeleb1 test set and even outperforms the traditional real-valued network on the text-dependent Xiaole dataset, while achieving a 32x memory saving.
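
The 32x figure is consistent with replacing each 32-bit floating-point weight by a single bit. The sketch below illustrates that arithmetic only. It assumes sign-based binarization (the abstract does not state the exact binarization function), and the helper names `binarize`, `pack_bits`, and `unpack_bits` are hypothetical, not taken from the paper.

```python
from array import array
import random


def binarize(weights):
    """Map each real-valued weight to +1 or -1 by its sign (zero maps to +1).

    Assumption: plain sign binarization, used here only for illustration.
    """
    return [1 if w >= 0 else -1 for w in weights]


def pack_bits(signs):
    """Store each +1/-1 value as one bit (1 for +1, 0 for -1), eight per byte."""
    packed = bytearray((len(signs) + 7) // 8)
    for i, s in enumerate(signs):
        if s > 0:
            packed[i // 8] |= 1 << (i % 8)
    return bytes(packed)


def unpack_bits(packed, n):
    """Recover the first n +1/-1 values from the packed bytes."""
    return [1 if (packed[i // 8] >> (i % 8)) & 1 else -1 for i in range(n)]


random.seed(0)
# 1024 weights held as 32-bit floats (array type code "f").
weights = array("f", (random.uniform(-1.0, 1.0) for _ in range(1024)))

signs = binarize(weights)
packed = pack_bits(signs)

float_bytes = len(weights) * weights.itemsize  # bytes used by the float32 weights
binary_bytes = len(packed)                     # bytes used by the 1-bit weights
ratio = float_bytes / binary_bytes

print(f"float32: {float_bytes} bytes, binary: {binary_bytes} bytes, ratio: {ratio:.0f}x")
```

The ratio depends only on the bit widths, so it holds for any number of weights that is a multiple of eight. In a full network the overall saving would also depend on which layers are binarized and on any parameters kept in full precision, which the abstract does not detail.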
