SD LG AS May 25, 2023

Ordered and Binary Speaker Embedding

Jiaying Wang, Xianglong Wang, Namin Wang, Lantian Li, Dong Wang

arXiv:2305.16043v1 2.3

Originality: Incremental advance

AI Analysis

This work targets speaker recognition, aiming to make speaker embeddings more compact and more structured. The contribution appears incremental, since it builds on existing embedding methods.

The paper tackles the problem of dense, non-structural speaker embeddings. It proposes an ordered binary embedding approach that sorts the embedding dimensions via nested dropout and converts the sorted vectors to binary codes via Bernoulli sampling. The resulting codes support hierarchical clustering, reduced memory usage, and fast retrieval, which the authors verify empirically on speaker identification tasks using the VoxCeleb and CN-Celeb datasets.

Modern speaker recognition systems represent utterances by embedding vectors. Conventional embedding vectors are dense and non-structural. In this paper, we propose an ordered binary embedding approach that sorts the dimensions of the embedding vector via a nested dropout and converts the sorted vectors to binary codes via Bernoulli sampling. The resultant ordered binary codes offer some important merits such as hierarchical clustering, reduced memory usage, and fast retrieval. These merits were empirically verified by comprehensive experiments on a speaker identification task with the VoxCeleb and CN-Celeb datasets.
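To make the two ingredients named in the abstract more concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the authors' implementation: the function names, the geometric cut-off distribution in `nested_dropout_mask`, the sigmoid used to turn activations into bit probabilities, and the prefix-based Hamming comparison are all choices made for this example and are not taken from the paper.

```python
import math
import random


def nested_dropout_mask(dim, rng, p=0.1):
    """Return a 0/1 mask that keeps a random-length prefix of dimensions.

    Assumption of this sketch: the cut-off index follows a truncated
    geometric distribution with parameter p. Earlier dimensions are kept
    more often, so training pushes them to carry more information.
    """
    cutoff = 1
    while cutoff < dim and rng.random() > p:
        cutoff += 1
    return [1.0 if i < cutoff else 0.0 for i in range(dim)]


def bernoulli_binarize(activations, rng):
    """Sample one bit per dimension with probability sigmoid(activation).

    Assumption: a sigmoid maps each real-valued activation to a
    Bernoulli parameter.
    """
    bits = []
    for a in activations:
        prob = 1.0 / (1.0 + math.exp(-a))
        bits.append(1 if rng.random() < prob else 0)
    return bits


def pack_bits(bits):
    """Pack a bit list into an int; the first dimension is the top bit."""
    code = 0
    for b in bits:
        code = (code << 1) | b
    return code


def prefix_hamming(code_a, code_b, dim, k):
    """Hamming distance over only the first k of dim ordered bits."""
    shift = dim - k
    return bin((code_a >> shift) ^ (code_b >> shift)).count("1")
```

Because the bits are ordered, comparing only a short prefix (small `k`) gives a coarse match, and lengthening the prefix refines it. That is one way to read the hierarchical clustering and fast retrieval claims; storing one bit per dimension instead of a float accounts for the memory saving.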
