SD LG AS

Sep 19, 2021

ARCA23K: An audio dataset for investigating open-set label noise

Turab Iqbal, Yin Cao, Andrew Bailey, Mark D. Plumbley, Wenwu Wang

arXiv:2109.09227v27.38 citations

Has Code

Originality: Synthesis-oriented

AI Analysis

This dataset helps researchers study label noise in audio data, but the contribution is incremental because it builds on existing datasets such as FSDKaggle2018 and FSDnoisy18K.

The authors introduce ARCA23K, a dataset of over 23,000 labelled audio clips from Freesound, for investigating open-set label noise, that is, labelling errors caused by out-of-vocabulary clips, which account for most of the errors in the dataset. They also study the impact of label noise on classification performance and representation learning.

The availability of audio data on sound sharing platforms such as Freesound gives users access to large amounts of annotated audio. Utilising such data for training is becoming increasingly popular, but the problem of label noise that is often prevalent in such datasets requires further investigation. This paper introduces ARCA23K, an Automatically Retrieved and Curated Audio dataset comprised of over 23000 labelled Freesound clips. Unlike past datasets such as FSDKaggle2018 and FSDnoisy18K, ARCA23K facilitates the study of label noise in a more controlled manner. We describe the entire process of creating the dataset such that it is fully reproducible, meaning researchers can extend our work with little effort. We show that the majority of labelling errors in ARCA23K are due to out-of-vocabulary audio clips, and we refer to this type of label noise as open-set label noise. Experiments are carried out in which we study the impact of label noise in terms of classification performance and representation learning.
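To make the abstract's definition concrete, the following is a minimal sketch of how labelling errors could be separated into open-set errors (the clip's true class is outside the target vocabulary) and in-vocabulary errors, called "closed-set" here purely for contrast. The vocabulary, the toy clip list, and the `noise_type` helper are all hypothetical and are not taken from ARCA23K or the authors' code.

```python
from collections import Counter

# Hypothetical target vocabulary (illustrative only, not the ARCA23K classes).
VOCABULARY = {"dog_bark", "rain", "piano"}

# Toy annotations: (assigned_label, true_class).
# The true class may lie outside the vocabulary.
clips = [
    ("dog_bark", "dog_bark"),   # label is correct
    ("rain", "piano"),          # wrong label, true class is in the vocabulary
    ("piano", "harpsichord"),   # wrong label, true class is out of vocabulary
    ("rain", "shower"),         # wrong label, true class is out of vocabulary
    ("dog_bark", "dog_bark"),   # label is correct
]


def noise_type(assigned_label, true_class, vocabulary):
    """Classify one annotation as clean, closed-set noise, or open-set noise."""
    if assigned_label == true_class:
        return "clean"
    if true_class in vocabulary:
        return "closed-set"
    return "open-set"


counts = Counter(noise_type(a, t, VOCABULARY) for a, t in clips)
for kind in ("clean", "closed-set", "open-set"):
    print(kind, counts[kind])
```

In this toy list, the out-of-vocabulary clips outnumber the in-vocabulary mistakes, which mirrors in miniature the situation the abstract reports for ARCA23K.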
