CL · Mar 17, 2019

Audio De-identification: A New Entity Recognition Task

Ido Cohn, Itay Laish, Genady Beryozkin, Gang Li, Izhak Shafran, Idan Szpektor, Tzvika Hartman, Avinatan Hassidim, Yossi Matias

arXiv:1903.07037v22.436 citations

Originality: Synthesis-oriented

AI Analysis

This work addresses privacy concerns in audio data, particularly for healthcare applications, but it is incremental in that it adapts existing text-based methods to audio.

The paper tackles the de-identification of personal information in audio recordings, such as medical conversations. It defines a new audio de-identification task and presents a pipeline, reporting its results on a new benchmark derived from the Switchboard and Fisher datasets.

Named Entity Recognition (NER) has been mostly studied in the context of written text. Specifically, NER is an important step in de-identification (de-ID) of medical records, many of which are recorded conversations between a patient and a doctor. In such recordings, audio spans with personal information should be redacted, similar to the redaction of sensitive character spans in de-ID for written text. The application of NER in the context of audio de-identification has yet to be fully investigated. To this end, we define the task of audio de-ID, in which audio spans with entity mentions should be detected. We then present our pipeline for this task, which involves Automatic Speech Recognition (ASR), NER on the transcript text, and text-to-audio alignment. Finally, we introduce a novel metric for audio de-ID and a new evaluation benchmark consisting of a large labeled segment of the Switchboard and Fisher audio datasets and detail our pipeline's results on it.
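
The final stage of the pipeline described in the abstract, mapping entity spans found in the transcript back to audio spans, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the ASR step already yields per-word start and end times and that the NER step yields character spans over the transcript; the `Word` class, the helper names, and the sample timings are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Word:
    """One ASR output token with its (assumed) timing in seconds."""
    text: str
    start: float
    end: float


def build_transcript(words: List[Word]) -> Tuple[str, List[Tuple[int, int]]]:
    """Join words with single spaces and record each word's character span."""
    spans = []
    pos = 0
    for w in words:
        spans.append((pos, pos + len(w.text)))
        pos += len(w.text) + 1  # +1 for the separating space
    return " ".join(w.text for w in words), spans


def char_spans_to_audio_spans(
    words: List[Word],
    word_char_spans: List[Tuple[int, int]],
    entity_char_spans: List[Tuple[int, int]],
) -> List[Tuple[float, float]]:
    """Map each entity character span to the time span of the words it overlaps."""
    audio_spans = []
    for ent_start, ent_end in entity_char_spans:
        covered = [
            w
            for w, (s, e) in zip(words, word_char_spans)
            if s < ent_end and e > ent_start  # half-open interval overlap
        ]
        if covered:
            audio_spans.append((covered[0].start, covered[-1].end))
    return audio_spans


# Hypothetical ASR output for a short utterance.
words = [
    Word("my", 0.0, 0.2),
    Word("name", 0.2, 0.5),
    Word("is", 0.5, 0.6),
    Word("john", 0.6, 1.0),
    Word("smith", 1.0, 1.5),
]
transcript, word_char_spans = build_transcript(words)

# Stand-in for the NER step: the character span of "john smith".
entity_char_spans = [(11, 21)]

redact_spans = char_spans_to_audio_spans(words, word_char_spans, entity_char_spans)
print(redact_spans)
```

In this sketch, any word that overlaps an entity span even partially is redacted in full, which favors over-redaction over leaking part of a name. A real system would also have to cope with ASR errors and imprecise word timings, which this example ignores.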
