Training neural audio classifiers with few data
This work addresses the challenge of data scarcity in audio classification. Its contribution is incremental: it compares existing methods rather than introducing a new paradigm.
The paper tackles the problem of training neural audio classifiers with limited annotated data by evaluating regularization, prototypical networks, and transfer learning. It finds that transfer learning is effective, but that prototypical networks show promise when external data is unavailable.
We investigate supervised learning strategies that improve the training of neural network audio classifiers on small annotated collections. In particular, we study whether (i) a naive regularization of the solution space, (ii) prototypical networks, (iii) transfer learning, or (iv) their combination can help deep learning models make better use of a small number of training examples. To this end, we evaluate (i)-(iv) on the tasks of acoustic event recognition and acoustic scene classification, considering from 1 to 100 labeled examples per class. Results indicate that transfer learning is a powerful strategy in such scenarios, but prototypical networks show promising results when one does not have access to external or validation data.
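To make strategy (ii) concrete, the following is a minimal sketch of how a prototypical network classifies once embeddings are available. It is an illustration under stated assumptions, not the configuration evaluated in this work: the embeddings here are fixed toy vectors rather than the output of a trained audio network, the squared Euclidean distance is the commonly used choice for prototypical networks, and the function names `class_prototypes` and `prototypical_probs` are hypothetical.

```python
import numpy as np

def class_prototypes(support_emb, support_labels, n_classes):
    """Return one prototype per class: the mean of that class's support embeddings."""
    return np.stack(
        [support_emb[support_labels == c].mean(axis=0) for c in range(n_classes)]
    )

def prototypical_probs(query_emb, prototypes):
    """Softmax over negative squared Euclidean distances to each prototype."""
    # diff has shape (n_query, n_classes, embedding_dim)
    diff = query_emb[:, None, :] - prototypes[None, :, :]
    sq_dist = (diff ** 2).sum(axis=-1)
    logits = -sq_dist
    # Subtract the row-wise maximum for numerical stability
    logits = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(logits)
    return exp / exp.sum(axis=1, keepdims=True)

# Toy data (assumption): 2 classes, 2 labeled examples per class, 2-D embeddings.
# In practice these vectors would come from a neural network applied to audio.
support_emb = np.array([[0.0, 0.0], [0.2, 0.0], [1.0, 1.0], [1.0, 0.8]])
support_labels = np.array([0, 0, 1, 1])
query_emb = np.array([[0.1, 0.1], [0.9, 1.0]])

prototypes = class_prototypes(support_emb, support_labels, n_classes=2)
probs = prototypical_probs(query_emb, prototypes)
predictions = probs.argmax(axis=1)
print(predictions)
```

Because each class is summarized by a single mean vector, the classifier itself adds no parameters beyond the embedding network, which is one reason the approach is attractive with very few labeled examples per class.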