Sparse Architectures for Text-Independent Speaker Verification Using Deep Neural Networks
This work addresses the computational demands of speaker verification systems, but its contribution is incremental: it applies known pruning techniques to a specific domain.
The paper tackles the computational inefficiency of deep neural networks for text-independent speaker verification by imposing structured sparsity to prune unimportant weights. The authors report that this also improves verification performance, possibly because it reduces overfitting.
Network pruning is important because it eliminates unimportant weights or features that are activated as a result of network over-parametrization. The advantages of enforcing sparsity include preventing overfitting and speeding up computation. Given the large number of parameters in deep architectures, network compression becomes critical because of the substantial computational power these architectures require. In this work, we impose structured sparsity for speaker verification, which is the validation of a query speaker against the speaker gallery. We show that merely enforcing sparsity can improve verification results, possibly because of initial overfitting in the network.
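
To illustrate what structured sparsity can look like in practice, the following is a minimal NumPy sketch, not the paper's implementation. It assumes a group-lasso style penalty in which each row of a weight matrix (one output unit) forms a group, followed by removal of rows whose norm falls below a cutoff. The function names, the threshold, and the toy weights are hypothetical and chosen only for illustration.

```python
import numpy as np


def group_lasso_penalty(weights: np.ndarray) -> float:
    """Sum of the L2 norms of the rows; each row is treated as one group."""
    return float(np.sum(np.linalg.norm(weights, axis=1)))


def prune_rows(weights: np.ndarray, threshold: float):
    """Zero out every row whose L2 norm is below `threshold`.

    Returns the pruned matrix and a boolean mask of the rows that were kept.
    """
    row_norms = np.linalg.norm(weights, axis=1)
    keep = row_norms >= threshold
    pruned = weights * keep[:, None]  # broadcast the mask across columns
    return pruned, keep


# Toy layer with 4 output units and 3 inputs (hypothetical values).
# Rows 1 and 3 mimic units that a sparsity penalty has driven close to zero.
weights = np.array([
    [0.8, -0.5, 0.3],
    [0.001, -0.002, 0.001],
    [0.4, 0.9, -0.7],
    [-0.003, 0.0, 0.002],
])

pruned, keep = prune_rows(weights, threshold=0.05)
compact = weights[keep]  # smaller dense matrix containing only the kept units
```

Because entire rows are removed rather than scattered individual weights, the surviving units can be stored as a smaller dense matrix (`compact` above), which is what allows structured sparsity to translate into an actual reduction in computation.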