CV, Jan 29, 2018

Local Visual Microphones: Improved Sound Extraction from Silent Video

arXiv:1801.09436v1, 5 citations
Originality: Incremental advance
AI Analysis

This work addresses the challenge of sound extraction for applications in surveillance or multimedia, but it is incremental as it builds on existing techniques.

The paper tackles the problem of extracting sound from silent video by analyzing local vibration patterns at different image locations. The authors report sound quality that improves on the state of the art, and a speedup of two to three orders of magnitude that reaches real-time performance on 20 kHz video.

Sound waves cause small vibrations in nearby objects. A few techniques exist in the literature that can extract sound from video. In this paper we study local vibration patterns at different image locations. We show that different locations in the image vibrate differently. We carefully aggregate local vibrations and produce a sound quality that improves on the state of the art. We show that local vibrations can have a time delay because sound waves take time to travel through the air. We use this phenomenon to estimate sound direction. We also present a novel algorithm that speeds up sound extraction by two to three orders of magnitude and reaches real-time performance on a 20 kHz video.
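
The time-delay idea in the abstract can be illustrated with a generic cross-correlation estimate. This is only a minimal sketch, not the paper's algorithm: it assumes two 1-D vibration signals have already been extracted from two image locations, and the function name `estimate_delay`, the synthetic signals, and the 5-sample shift are all hypothetical choices made for illustration.

```python
import numpy as np

def estimate_delay(sig_a, sig_b, sample_rate):
    """Estimate how much sig_b lags sig_a, in seconds (negative if it leads)."""
    # Remove the mean so a constant offset does not dominate the correlation.
    a = np.asarray(sig_a, dtype=float) - np.mean(sig_a)
    b = np.asarray(sig_b, dtype=float) - np.mean(sig_b)
    # Full cross-correlation; output index 0 corresponds to lag -(len(a) - 1).
    corr = np.correlate(b, a, mode="full")
    lag_samples = int(np.argmax(corr)) - (len(a) - 1)
    return lag_samples / sample_rate

# Synthetic stand-ins for two local vibration signals (assumed, not real data).
fs = 20_000                      # frame rate in Hz, matching the 20 kHz video
rng = np.random.default_rng(0)
a = rng.standard_normal(2000)    # vibration at location A
shift = 5                        # hypothetical delay in frames
b = np.concatenate([np.zeros(shift), a[:-shift]])  # location B sees it later

delay = estimate_delay(a, b, fs)
print(f"estimated delay: {delay * 1e6:.0f} microseconds")
```

Turning such a delay into a direction additionally requires the speed of sound and the physical distance between the two locations, neither of which is given in the abstract, so that step is left out here.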

