Semi-Supervised Audio Classification with Partially Labeled Data
This work addresses the challenge of expensive and tedious full annotation in audio classification, offering incremental improvements for models by leveraging larger amounts of partially labeled data.
The paper tackles the problem of audio classification with partially labeled datasets by presenting two semi-supervised methods, including a teacher-student process and a mean teacher algorithm, and evaluates them on two public datasets to show the impact of handling missing labels and the benefits of different strategies.
Audio classification has seen great progress with the increasing availability of large-scale datasets. These large datasets, however, are often only partially labeled as collecting full annotations is a tedious and expensive process. This paper presents two semi-supervised methods capable of learning with missing labels and evaluates them on two publicly available, partially labeled datasets. The first method relies on label enhancement by a two-stage teacher-student learning process, while the second method utilizes the mean teacher semi-supervised learning algorithm. Our results demonstrate the impact of improperly handling missing labels and compare the benefits of using different strategies leveraging data with few labels. Methods capable of learning with partially labeled data have the potential to improve models for audio classification by utilizing even larger amounts of data without the need for complete annotations.