SD DB · Apr 11, 2012

Employing Subsequence Matching in Audio Data Processing

arXiv:1204.2541v1 · 4 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses audio data processing challenges, particularly in automatic speech recognition, but appears incremental as it builds on existing frameworks and methods.

The paper tackles the problem of audio retrieval and time-series subsequence matching by presenting a Subsequence Matching Framework built on MESSIF to improve query processing performance, with a proof-of-concept application for spoken term detection using phonetic posteriograms.

We give an overview of current problems in audio retrieval and time-series subsequence matching. We discuss the use of subsequence matching approaches in audio data processing, especially in the area of automatic speech recognition (ASR), and we aim to improve the performance of the retrieval process. To overcome problems known from the time-series area, such as implementation bias and data bias, we present a Subsequence Matching Framework as a tool for fast prototyping, building, and testing of similarity search subsequence matching applications. The framework is built on top of MESSIF (Metric Similarity Search Implementation Framework), so subsequence matching algorithms can exploit advanced similarity indexes to significantly increase their query processing performance. As a proof of concept, we provide a design of a query-by-example spoken term detection application that uses phonetic posteriograms and a subsequence matching approach.
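
To make the idea of query-by-example subsequence matching concrete, here is a minimal sketch in Python. It is not the framework's API and not the authors' algorithm: it assumes each time frame is a small vector of phone posterior probabilities, slides a fixed-length window over the longer sequence, and ranks windows by the sum of frame-wise Euclidean distances. The names `frame_distance`, `window_distance`, and `best_matches` are hypothetical, and the linear scan stands in for the similarity indexes that the abstract says MESSIF would provide.

```python
import math
from typing import List, Sequence, Tuple

# Assumption: one frame is a vector of posterior probabilities, one per phone class.
Frame = Sequence[float]


def frame_distance(a: Frame, b: Frame) -> float:
    """Euclidean distance between two frames of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def window_distance(query: Sequence[Frame], window: Sequence[Frame]) -> float:
    """Sum of frame-wise distances between two equal-length frame sequences."""
    return sum(frame_distance(q, w) for q, w in zip(query, window))


def best_matches(
    query: Sequence[Frame], series: Sequence[Frame], k: int = 1
) -> List[Tuple[float, int]]:
    """Return the k (distance, offset) pairs with the smallest distance.

    Every window of len(query) frames in `series` is compared with the query.
    This is a plain linear scan; an index would avoid visiting every offset.
    """
    n = len(query)
    if n == 0 or n > len(series):
        return []
    scored = [
        (window_distance(query, series[i:i + n]), i)
        for i in range(len(series) - n + 1)
    ]
    return sorted(scored)[:k]


# Toy data: five frames over two hypothetical phone classes.
series = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [0.0, 1.0], [1.0, 0.0]]
query = [[0.5, 0.5], [0.0, 1.0]]
top = best_matches(query, series, k=2)
```

In this toy run the query occurs verbatim at offset 2, so the best match has distance zero. A real spoken term detection system would typically need a distance that tolerates differences in speaking rate, which this fixed-length comparison does not.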

