IR CV MM SD | Oct 2, 2020

AVECL-UMONS database for audio-visual event classification and localization

arXiv:2011.01018v1 | 3 citations
AI Analysis

This paper provides a new dataset for researchers in audio-visual machine learning, but the contribution is incremental: it targets a specific domain and introduces no novel methods.

The authors tackled the problem of audio-visual event classification and localization in office environments by introducing the AVECL-UMons dataset, which includes 11 event classes, 5.24 hours of recordings, and 5386 sequences.

We introduce the AVECL-UMons dataset for audio-visual event classification and localization in the context of office environments. The audio-visual dataset is composed of 11 event classes recorded at several realistic positions in two different rooms. Two types of sequences are recorded, distinguished by the number of events they contain. The dataset comprises 2662 unilabel sequences and 2724 multilabel sequences, corresponding to a total of 5.24 hours. The dataset is publicly accessible online: https://zenodo.org/record/3965492#.X09wsobgrCI.
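As a quick sanity check on the figures quoted above, the sketch below adds up the two sequence counts and derives an average sequence length. The function and constant names are hypothetical, chosen only for illustration. The average is an estimate under an assumption: it divides the rounded 5.24-hour total evenly across all sequences, whereas the abstract does not state per-sequence durations or whether unilabel and multilabel sequences differ in length.

```python
# Figures quoted in the abstract.
UNILABEL_SEQUENCES = 2662
MULTILABEL_SEQUENCES = 2724
TOTAL_HOURS = 5.24  # rounded value as reported


def dataset_summary(unilabel: int, multilabel: int, hours: float) -> dict:
    """Derive aggregate statistics from the reported counts.

    The mean duration assumes the total recording time is spread evenly
    over all sequences, which the abstract does not confirm.
    """
    total = unilabel + multilabel
    return {
        "total_sequences": total,
        "multilabel_fraction": multilabel / total,
        "mean_seconds_per_sequence": hours * 3600 / total,
    }


summary = dataset_summary(UNILABEL_SEQUENCES, MULTILABEL_SEQUENCES, TOTAL_HOURS)
for name, value in summary.items():
    print(f"{name}: {value}")
```

The total matches the 5386 sequences cited in the summary, the two sequence types are split almost evenly, and the implied average length is roughly 3.5 seconds per sequence.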

