SD, LG, AS · Sep 11, 2020

SONYC-UST-V2: An Urban Sound Tagging Dataset with Spatiotemporal Context

arXiv:2009.05188v1 · 47 citations
Originality: Synthesis-oriented
AI Analysis

This dataset supports researchers and practitioners working on urban noise monitoring by pairing audio recordings with spatiotemporal context. The contribution is incremental, since it builds on existing urban sound datasets.

The authors introduced SONYC-UST-V2, a dataset of 18,510 urban audio recordings with spatiotemporal metadata to aid in sound tagging, and reported results from a baseline model using this information.

We present SONYC-UST-V2, a dataset for urban sound tagging with spatiotemporal information. This dataset is intended for the development and evaluation of machine listening systems for real-world urban noise monitoring. While datasets of urban recordings are available, this dataset provides the opportunity to investigate how spatiotemporal metadata can aid in the prediction of urban sound tags. SONYC-UST-V2 consists of 18,510 audio recordings from the "Sounds of New York City" (SONYC) acoustic sensor network, each including the timestamp of audio acquisition and the location of the sensor. The dataset contains annotations by volunteers from the Zooniverse citizen science platform, as well as a two-stage verification by our team. In this article, we describe our data collection procedure and propose evaluation metrics for multilabel classification of urban sound tags. We report the results of a simple baseline model that exploits spatiotemporal information.

