SD · MM · Jun 9, 2016

Audio Content based Geotagging in Multimedia

arXiv:1606.02816v2 · 7 citations
AI Analysis

This paper addresses geotagging for multimedia applications, but the contribution appears incremental: it applies existing matrix factorization techniques to a new audio-based context.

The paper tackles the problem of geotagging multimedia recordings by inferring location from the composition of sound events in audio, using matrix factorization to extract semantic sound classes and combining them to identify the recording's location.

In this paper we propose methods to extract geographically relevant information from a multimedia recording using its audio. Our method is based primarily on the fact that an urban acoustic environment consists of a variety of sounds. Hence, location information can be inferred from the composition of sound events/classes present in the audio. More specifically, we adopt matrix factorization techniques to obtain the semantic content of a recording in terms of different sound classes. This semantic information is then combined to identify the location of the recording.
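The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a fixed non-negative dictionary `W` whose columns are spectral bases already labelled by sound class, and it estimates activations `H` with standard multiplicative updates for Euclidean non-negative matrix factorization. The nearest-centroid step, the function names, the toy dictionary, and the location centroids are all hypothetical, chosen only to show how per-class activations might be combined into a location guess.

```python
import numpy as np

def activations(V, W, n_iter=200, eps=1e-9):
    """Estimate H >= 0 such that V is close to W @ H, keeping W fixed.

    Uses the standard multiplicative update for the Euclidean NMF objective.
    V: (n_freq, n_frames) non-negative spectrogram.
    W: (n_freq, n_bases) non-negative dictionary.
    """
    rng = np.random.default_rng(0)
    H = rng.random((W.shape[1], V.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
    return H

def class_composition(V, W, class_of_basis, n_classes):
    """Pool activations over time and over the bases of each sound class,
    then normalize so the result sums to 1."""
    per_basis = activations(V, W).sum(axis=1)
    comp = np.zeros(n_classes)
    np.add.at(comp, class_of_basis, per_basis)
    return comp / (comp.sum() + 1e-12)

def predict_location(comp, centroids):
    """Hypothetical combiner: pick the location whose typical
    sound-class composition is closest to the observed one."""
    return min(centroids, key=lambda name: np.linalg.norm(comp - centroids[name]))

# Toy data (assumed for illustration): 6 frequency bins, 4 bases,
# bases 0-1 belong to class 0 ("traffic"), bases 2-3 to class 1 ("birds").
W = np.array([[1, 0, 0, 0],
              [1, 1, 0, 0],
              [0, 1, 0, 0],
              [0, 0, 1, 0],
              [0, 0, 1, 1],
              [0, 0, 0, 1]], dtype=float)
class_of_basis = np.array([0, 0, 1, 1])
centroids = {"street": np.array([0.9, 0.1]), "park": np.array([0.2, 0.8])}

# A recording dominated by class-0 activity.
H_true = np.array([[3.0, 2.0, 3.0],
                   [2.0, 3.0, 2.0],
                   [0.2, 0.1, 0.3],
                   [0.1, 0.2, 0.1]])
V = W @ H_true
comp = class_composition(V, W, class_of_basis, n_classes=2)
print(comp, predict_location(comp, centroids))
```

In practice the dictionary would be learned from labelled sound-event data, and the final step would more plausibly be a trained classifier than a nearest-centroid rule; both are simplified here to keep the sketch short.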

