AS · LG · SD · ML · Oct 28, 2017

Attention-Based Models for Text-Dependent Speaker Verification

arXiv:1710.10470v3 · 175 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses speaker verification for security applications, but it is incremental as it adapts existing attention methods to a specific domain.

The paper tackles text-dependent speaker verification by applying attention mechanisms to sequence summarization in an end-to-end system, showing that attention-based models improve the Equal Error Rate by 14% relative to a non-attention LSTM baseline.
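To make the "14% relative" figure concrete, here is a minimal sketch of how a relative EER reduction is computed. The two EER values are hypothetical numbers chosen for illustration; they are not results reported in the paper.

```python
def relative_eer_reduction(baseline_eer: float, new_eer: float) -> float:
    """Fraction by which the EER drops relative to the baseline."""
    return (baseline_eer - new_eer) / baseline_eer


# Hypothetical EER values in percent, for illustration only.
baseline_eer = 5.00   # assumed non-attention baseline
attention_eer = 4.30  # assumed attention-based model

reduction = relative_eer_reduction(baseline_eer, attention_eer)
print(f"Relative EER reduction: {reduction:.0%}")
```

Note that a relative reduction of 14% is much smaller in absolute terms: in this illustration the EER drops by only 0.7 percentage points.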

Attention-based models have recently shown strong performance on a range of tasks, such as speech recognition, machine translation, and image captioning, owing to their ability to summarize relevant information that spans the entire length of an input sequence. In this paper, we analyze the use of attention mechanisms for sequence summarization in our end-to-end text-dependent speaker recognition system. We explore different topologies of the attention layer and their variants, and compare different pooling methods on the attention weights. Ultimately, we show that attention-based models can improve the Equal Error Rate (EER) of our speaker verification system by a relative 14% compared to our non-attention LSTM baseline model.
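The sequence summarization described in the abstract can be sketched as a weighted average of per-frame LSTM outputs. The sketch below assumes a simple linear scoring function followed by a softmax; the abstract does not specify the scoring function, and the names `attention_pool`, `w`, and `b` are hypothetical. It illustrates the general technique and is not the authors' implementation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(frames, w, b=0.0):
    """Summarize T frame vectors into one vector.

    frames: list of T vectors (stand-ins for LSTM outputs h_t)
    w, b:   assumed parameters of a linear scoring function e_t = w . h_t + b
    """
    scores = [sum(wi * hi for wi, hi in zip(w, h)) + b for h in frames]
    alphas = softmax(scores)  # attention weights, sum to 1
    dim = len(frames[0])
    pooled = [sum(a * h[d] for a, h in zip(alphas, frames)) for d in range(dim)]
    return pooled, alphas


# Toy input: three 2-dimensional frames.
frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# With zero scoring weights every frame gets the same weight,
# so attention pooling reduces to plain averaging.
pooled, alphas = attention_pool(frames, w=[0.0, 0.0])
```

With non-zero `w`, frames that score higher contribute more to the pooled vector, which is how an attention layer can emphasize the informative parts of an utterance instead of relying on the last frame or a uniform average.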

Code Implementations: 2 repos