AS SD SP
Feb 4, 2020

Audio-Visual Calibration with Polynomial Regression for 2-D Projection Using SVD-PHAT

arXiv:2002.01440v2
1 citation
Originality: Incremental advance
AI Analysis

This paper addresses the need for accurate audio-visual alignment in applications such as surveillance and human-computer interaction. The advance is incremental because it builds on existing methods such as SVD-PHAT.

The paper tackles the problem of spatially calibrating a camera's visual field with a microphone array's auditory field by overlaying an acoustic image on an optical image. It uses polynomial regression to handle non-linear camera distortion, and it adapts SVD-PHAT, a sound source localization method designed for real-time processing, to this task.

This paper proposes a straightforward 2-D method to spatially calibrate the visual field of a camera with the auditory field of an array microphone by generating and overlaying an acoustic image over an optical image. Using a low-cost microphone array and an off-the-shelf camera, we show that polynomial regression can deal efficiently with non-linear camera distortion, and that a recently proposed sound source localization method for real-time processing, SVD-PHAT, can be adapted for this task.
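The polynomial-regression step can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the calibration is a least-squares fit of a 2-D polynomial that maps acoustic-image coordinates (here called `az` and `el`) to camera pixel coordinates (`px`, `py`). The function names, the degree, and the synthetic distortion below are all hypothetical.

```python
import numpy as np

def poly_features(az, el, degree=2):
    """Design matrix with all monomials az**i * el**j for i + j <= degree."""
    az = np.asarray(az, dtype=float)
    el = np.asarray(el, dtype=float)
    cols = [az**i * el**j
            for i in range(degree + 1)
            for j in range(degree + 1 - i)]
    return np.stack(cols, axis=1)

def fit_calibration(az, el, px, py, degree=2):
    """Least-squares fit of polynomial coefficients for both pixel axes."""
    A = poly_features(az, el, degree)
    targets = np.stack([px, py], axis=1)
    coeffs, *_ = np.linalg.lstsq(A, targets, rcond=None)
    return coeffs  # shape: (number of monomials, 2)

def apply_calibration(coeffs, az, el, degree=2):
    """Map acoustic-image coordinates to predicted pixel coordinates."""
    return poly_features(az, el, degree) @ coeffs

# Synthetic calibration points (assumed): a 5x5 grid of source positions
# and an invented, mildly non-linear mapping to a 640x480 image.
grid = np.linspace(-1.0, 1.0, 5)
az, el = [v.ravel() for v in np.meshgrid(grid, grid)]
px = 320 + 300 * az + 20 * az * el + 15 * az**2
py = 240 - 200 * el + 10 * el**2

coeffs = fit_calibration(az, el, px, py, degree=2)
pred = apply_calibration(coeffs, az, el, degree=2)
max_err = np.abs(pred - np.stack([px, py], axis=1)).max()
print(f"max reprojection error (pixels): {max_err:.2e}")
```

With real measurements, the calibration points would come from localizing a sound source at known image positions, and the degree would be chosen by checking the reprojection error on held-out points rather than fixed in advance.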

