Active Voice Authentication
This addresses the problem of real-time identity verification for users of devices or services, offering an incremental improvement in accuracy for short-duration voice authentication.
The paper tackled reliable speaker verification using very short voice signals, achieving an average equal error rate of 3-4% on a custom dataset and a 3.88% absolute gain over i-vector on the NIST SRE 2001 Dataset with 1-second test segments.
Active authentication refers to a new mode of identity verification in which biometric indicators are continuously tested to provide real-time or near real-time monitoring of an authorized access to a service or use of a device. This is in contrast to the conventional authentication systems where a single test in form of a verification token such as a password is performed. In active voice authentication (AVA), voice is the biometric modality. This paper describes an ensemble of techniques that make reliable speaker verification possible using unconventionally short voice test signals. These techniques include model adaptation and minimum verification error (MVE) training that are tailored for the extremely short training and testing requirements. A database of 25 speakers is recorded for developing this system. In our off-line evaluation on this dataset, the system achieves an average windowed-based equal error rates of 3-4% depending on the model configuration, which is remarkable considering that only 1 second of voice data is used to make every single authentication decision. On the NIST SRE 2001 Dataset, the system provides a 3.88% absolute gain over i-vector when the duration of test segment is 1 second. A real-time demonstration system has been implemented on Microsoft Surface Pro.