SD · LG · AS · Aug 28, 2022

Computing with Hypervectors for Efficient Speaker Identification

arXiv:2208.13285v1 · 2 citations · h-index: 84
Originality: Incremental advance
AI Analysis

The method offers an efficient alternative for speaker identification, particularly in resource-constrained environments, though it is an incremental advance relative to existing CNN methods.

The paper tackles speaker identification by computing with high-dimensional random vectors, a simple and fast approach that achieves Top-1 and Top-5 scores of 31% and 52% on VoxCeleb1 with only 1.02k active parameters. Additional GLVQ training raises these scores to 48% and 67%.

We introduce a method to identify speakers by computing with high-dimensional random vectors. Its strengths are simplicity and speed. With only 1.02k active parameters and a 128-minute pass through the training data we achieve Top-1 and Top-5 scores of 31% and 52% on the VoxCeleb1 dataset of 1,251 speakers. This is in contrast to CNN models requiring several million parameters and orders of magnitude higher computational complexity for only a 2$\times$ gain in discriminative power as measured in mutual information. An additional 92 seconds of training with Generalized Learning Vector Quantization (GLVQ) raises the scores to 48% and 67%. A trained classifier classifies 1 second of speech in 5.7 ms. All processing was done on standard CPU-based machines.
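
To make "computing with high-dimensional random vectors" concrete, the following is a minimal sketch of a generic hypervector classifier, not the paper's actual pipeline. The abstract does not describe the speech encoding, so everything here is an assumption for illustration: the bind-and-bundle encoder, the dimensionality `DIM`, the helper names (`random_bipolar`, `encode`, `train_prototypes`, `top_k`), and the synthetic toy data that stands in for quantized speech features. It shows the general pattern of building one prototype per class in a single pass over the training data and ranking classes by similarity to obtain Top-1 and Top-5 scores.

```python
import numpy as np

DIM = 2048  # hypervector dimensionality (illustrative choice, not from the paper)


def random_bipolar(shape, rng):
    """Draw random hypervectors with entries in {-1, +1}."""
    return rng.integers(0, 2, size=shape, dtype=np.int8) * 2 - 1


def encode(sample, channel_hv, level_hv):
    """Hypothetical encoder: bind each channel's hypervector with the hypervector
    of its quantized level (elementwise product), then bundle by summation."""
    bound = channel_hv * level_hv[sample]      # shape (n_channels, DIM)
    return bound.sum(axis=0, dtype=np.int32)   # shape (DIM,)


def train_prototypes(samples, labels, n_classes, channel_hv, level_hv):
    """Single pass over the data: add each encoded sample to its class accumulator."""
    protos = np.zeros((n_classes, channel_hv.shape[1]), dtype=np.int32)
    for x, y in zip(samples, labels):
        protos[y] += encode(x, channel_hv, level_hv)
    return protos


def top_k(query_hv, protos, k):
    """Return indices of the k prototypes most similar to the query (cosine)."""
    q = query_hv / np.linalg.norm(query_hv)
    p = protos / np.linalg.norm(protos, axis=1, keepdims=True)
    return np.argsort(-(p @ q))[:k]


# Synthetic toy data (an assumption, standing in for real speech features):
# each class has a template of quantized levels; samples randomize 4 channels.
rng = np.random.default_rng(0)
n_classes, n_channels, n_levels = 10, 16, 8
channel_hv = random_bipolar((n_channels, DIM), rng)
level_hv = random_bipolar((n_levels, DIM), rng)
templates = rng.integers(0, n_levels, size=(n_classes, n_channels))


def sample_from(c):
    x = templates[c].copy()
    noisy = rng.choice(n_channels, size=4, replace=False)
    x[noisy] = rng.integers(0, n_levels, size=4)
    return x


train_y = np.repeat(np.arange(n_classes), 20)
train_x = np.array([sample_from(c) for c in train_y])
test_y = np.repeat(np.arange(n_classes), 5)
test_x = np.array([sample_from(c) for c in test_y])

protos = train_prototypes(train_x, train_y, n_classes, channel_hv, level_hv)
ranked = np.array([top_k(encode(x, channel_hv, level_hv), protos, 5) for x in test_x])

top1 = np.mean(ranked[:, 0] == test_y)                    # correct class ranked first
top5 = np.mean((ranked == test_y[:, None]).any(axis=1))   # correct class in the top 5
print(f"top-1 {top1:.2f}, top-5 {top5:.2f}")
```

Training in this sketch is only integer addition into per-class accumulators, which is one plausible reason such methods run comfortably on CPUs. A refinement stage such as the GLVQ step mentioned in the abstract would adjust these prototypes afterwards and is not shown here.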

