The effects of anger on automated long-term-spectra based speaker-identification
The study found that anger significantly distorts the acoustic signal used in long-term spectra analysis for forensic speaker identification: even moderate anger shifts identification results by 33% toward a different speaker.
This is a critical limitation for forensic applications and argues against relying on this tool in cases involving emotional speech.
Forensic speaker identification has traditionally regarded approaches based on long-term spectra analysis as especially robust: they work well for short recordings, are insensitive to changes in the intensity of the sample, and continue to function in the presence of noise and a limited passband. We find, however, that anger significantly distorts the acoustic signal for the purposes of long-term spectra analysis. Even moderate anger offsets speaker identification results by 33% in the direction of a different speaker altogether. Caution should therefore be exercised when applying this tool to emotional speech.
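For readers unfamiliar with the technique, the following is a minimal sketch of a long-term average spectrum comparison. It is not the system evaluated in this study. The function names (`power_spectrum`, `ltas_db`, `ltas_distance`), the frame length, the Hann window, the mean normalisation in dB, and the Euclidean distance are all illustrative assumptions, and the test signals are synthetic noise rather than speech. A naive DFT keeps the example dependency-free; a practical implementation would use an FFT library.

```python
import cmath
import math
import random


def power_spectrum(frame):
    """Power spectrum of one frame (bins 0..N/2) via a naive DFT."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(frame))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def ltas_db(signal, frame_len=64, hop=32):
    """Long-term average spectrum in dB, with the mean level removed.

    Assumption for illustration: subtracting the mean dB level is one
    simple way to make the result independent of overall intensity.
    """
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / frame_len)
              for i in range(frame_len)]  # periodic Hann window
    starts = range(0, len(signal) - frame_len + 1, hop)
    frames = [signal[s:s + frame_len] for s in starts]
    if not frames:
        raise ValueError("signal is shorter than one frame")

    n_bins = frame_len // 2 + 1
    avg = [0.0] * n_bins
    for frame in frames:
        ps = power_spectrum([w * x for w, x in zip(window, frame)])
        for k in range(n_bins):
            avg[k] += ps[k] / len(frames)

    # The small constant only guards against log10(0) on silent input.
    db = [10.0 * math.log10(p + 1e-12) for p in avg]
    mean_db = sum(db) / len(db)
    return [d - mean_db for d in db]


def ltas_distance(a, b):
    """Euclidean distance between two LTAS vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Synthetic demonstration signals (not speech).
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(512)]
louder = [3.0 * x for x in noise]  # same spectrum shape, higher level
tilted = [noise[i] - noise[i - 1] for i in range(1, len(noise))]  # altered spectral shape

d_gain = ltas_distance(ltas_db(noise), ltas_db(louder))
d_tilt = ltas_distance(ltas_db(noise), ltas_db(tilted))
print("distance after gain change:", d_gain)
print("distance after spectral-shape change:", d_tilt)
```

In this sketch a pure gain change leaves the distance close to zero, which mirrors the insensitivity to intensity noted above, while a change in spectral shape produces a clearly larger distance. Any effect that reshapes the long-term spectrum in this way moves the measurement regardless of who is speaking.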