SD, AS, ML | Dec 12, 2017

Multi-Speaker Localization Using Convolutional Neural Network Trained with Noise

arXiv:1712.04276v1 | 51 citations
Originality: Incremental advance
AI Analysis

This paper addresses the problem of accurately localizing multiple speakers in noisy environments, with applications such as audio processing and robotics. The contribution appears incremental: it builds on existing CNN-based localization methods and adds a specific training approach.

The paper formulates multi-speaker localization as a multi-class multi-label classification problem and solves it with a convolutional neural network trained on synthesized noise signals. The method is evaluated for two speakers and compared against a steered response power method.

The problem of multi-speaker localization is formulated as a multi-class multi-label classification problem, which is solved using a convolutional neural network (CNN) based source localization method. Utilizing the common assumption of disjoint speaker activities, we propose a novel method to train the CNN using synthesized noise signals. The proposed localization method is evaluated for two speakers and compared to a well-known steered response power method.
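
To make the multi-class multi-label formulation concrete, here is a minimal NumPy sketch under assumptions not stated in the abstract. The 5-degree direction grid, the helper names (`multi_hot`, `binary_cross_entropy`, `mix_disjoint`, `pick_speakers`), and the per-frequency-bin mixing rule are all hypothetical. The mixing rule in particular is only one possible reading of how the disjoint-activity assumption could be used to build two-speaker training examples from single-source noise features. It is not a confirmed description of the paper's procedure.

```python
import numpy as np

def multi_hot(doas_deg, grid_deg):
    """Multi-label target: 1 at the grid class nearest each active direction."""
    target = np.zeros(len(grid_deg), dtype=np.float64)
    for doa in doas_deg:
        target[np.argmin(np.abs(grid_deg - doa))] = 1.0
    return target

def binary_cross_entropy(probs, target, eps=1e-7):
    """Mean per-class binary cross-entropy, the usual loss for sigmoid outputs."""
    probs = np.clip(np.asarray(probs, dtype=np.float64), eps, 1.0 - eps)
    return float(-np.mean(target * np.log(probs)
                          + (1.0 - target) * np.log(1.0 - probs)))

def mix_disjoint(feat_a, feat_b, rng):
    """ASSUMPTION: for each frequency bin (last axis), copy the features of
    exactly one of two single-source noise examples, so only one source is
    'active' per bin."""
    mask = rng.random(feat_a.shape[-1]) < 0.5
    return np.where(mask, feat_a, feat_b)

def pick_speakers(probs, grid_deg, num_speakers=2):
    """Return the directions of the num_speakers highest-scoring classes."""
    top = np.argsort(probs)[-num_speakers:]
    return sorted(float(grid_deg[i]) for i in top)

# Hypothetical setup: 37 direction classes on a 5-degree grid from 0 to 180.
grid = np.arange(0, 181, 5)
target = multi_hot([30, 120], grid)

# Hypothetical features of shape (microphones, frequency bins).
rng = np.random.default_rng(0)
feat_a = np.zeros((4, 8))
feat_b = np.ones((4, 8))
mixed = mix_disjoint(feat_a, feat_b, rng)
```

Because each class gets its own sigmoid-style probability, several directions can be active at once. This is what distinguishes the multi-label setup from a single-label softmax classifier.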
