SD · LG · MM · Mar 27, 2014

Automatic Segmentation of Broadcast News Audio using Self Similarity Matrix

arXiv:1403.6901v1 · 3 citations
AI Analysis

The paper addresses the need for building frugal speech corpora in audio news processing, but its contribution is incremental: it improves on existing sub-optimal methods for a specific domain.

The paper tackles the problem of segmenting audio news broadcasts to align with text transcripts, specifically identifying segments read by the newsreader, and proposes a technique using a Self Similarity Matrix that was verified on real broadcast data from All India Radio.

Generally, an audio news broadcast on radio is composed of music, commercials, news from correspondents, and recorded statements, in addition to the actual news read by the newsreader. When news transcripts are available, automatically segmenting the broadcast so that the audio can be time-aligned with the text transcription is essential for building frugal speech corpora. We address the problem of identifying the segments of an audio news broadcast that correspond to the news read by the newsreader, so that they can be mapped to the text transcripts. Existing techniques produce sub-optimal solutions when used to extract newsreader-read segments. In this paper, we propose a new technique that reliably identifies the acoustic change points using an acoustic Self Similarity Matrix (SSM). We describe the two-pass technique in detail and verify its performance on real audio news broadcasts of All India Radio in different languages.
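To make the idea of locating acoustic change points with a self-similarity matrix concrete, here is a minimal sketch in Python. It is not the paper's two-pass technique, which the abstract does not spell out. It assumes cosine similarity between frame-level feature vectors and swaps in a generic checkerboard-kernel novelty score (in the style of Foote) to find the boundary. The function names, the `half_width` parameter, and the synthetic 4-dimensional "features" are all hypothetical stand-ins for real acoustic features such as MFCCs.

```python
import numpy as np


def self_similarity_matrix(features):
    """Cosine similarity between every pair of feature frames (rows)."""
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.maximum(norms, 1e-12)  # guard against zero-norm frames
    return unit @ unit.T


def checkerboard_novelty(ssm, half_width):
    """Slide a checkerboard kernel along the SSM diagonal.

    Assumption: a generic Foote-style novelty score, used here only for
    illustration. The score is high where frames before and after index i
    are each self-similar but dissimilar to one another.
    """
    signs = np.concatenate([-np.ones(half_width), np.ones(half_width)])
    kernel = np.outer(signs, signs)  # +1 on within-side blocks, -1 on cross blocks
    n = ssm.shape[0]
    novelty = np.zeros(n)
    for i in range(half_width, n - half_width):
        window = ssm[i - half_width:i + half_width, i - half_width:i + half_width]
        novelty[i] = np.sum(window * kernel)
    return novelty


# Hypothetical data: two synthetic "acoustic conditions" of 40 frames each.
rng = np.random.default_rng(0)
segment_a = np.array([1.0, 0.0, 0.0, 0.0]) + 0.05 * rng.standard_normal((40, 4))
segment_b = np.array([0.0, 1.0, 0.0, 0.0]) + 0.05 * rng.standard_normal((40, 4))
features = np.vstack([segment_a, segment_b])

ssm = self_similarity_matrix(features)
novelty = checkerboard_novelty(ssm, half_width=8)
change_point = int(np.argmax(novelty))
print("estimated change point (frame index):", change_point)
```

On this toy input the novelty score peaks at or next to the frame where the two synthetic segments meet. Real broadcast audio would need actual acoustic features, a peak-picking threshold, and a further step to decide which segments belong to the newsreader.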

