Hybrid Disagreement-Diversity Active Learning for Bioacoustic Sound Event Detection
This work addresses data efficiency for biodiversity conservation, particularly the monitoring of endangered species. The contribution is incremental, however, because it refines existing methods for a specific domain.
The paper tackles the scarcity of annotated data in bioacoustic sound event detection by proposing a hybrid active learning method. Using only 2.3% of the annotations, the method reaches a mAP of 68% in the cold-start scenario and 71% in the warm-start scenario, close to the fully supervised mAP of 75%.
Bioacoustic sound event detection (BioSED) is crucial for biodiversity conservation but faces practical challenges during model development and training: limited annotated data, sparse events, species diversity, and class imbalance. To address these challenges efficiently under a limited labeling budget, we apply mismatch-first farthest-traversal (MFFT), an active learning method that integrates committee voting disagreement and diversity analysis. We also refine an existing BioSED dataset specifically for evaluating active learning algorithms. Experimental results demonstrate that MFFT achieves a mAP of 68% when cold-starting and 71% when warm-starting, close to the fully supervised mAP of 75%, while using only 2.3% of the annotations. Notably, MFFT excels in cold-start scenarios and on rare species, both of which are critical for monitoring endangered species, demonstrating its practical value.
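The selection idea behind MFFT, disagreement first and diversity second, can be illustrated with a minimal sketch. This is not the paper's implementation. The function name `mfft_select`, the per-segment embedding matrix, the committee prediction array, and the rule that a segment counts as a "mismatch" whenever committee members do not all predict the same class are assumptions made for illustration only.

```python
import numpy as np

def mfft_select(embeddings, committee_preds, labeled_idx, budget):
    """Pick `budget` unlabeled segments: mismatched ones first, farthest-first.

    embeddings: (n, d) float array, one row per audio segment (hypothetical features).
    committee_preds: (m, n) int array, class predicted by each of m committee members.
    labeled_idx: indices already annotated (empty for a cold start).
    """
    embeddings = np.asarray(embeddings, dtype=float)
    committee_preds = np.asarray(committee_preds)
    labeled_idx = np.asarray(labeled_idx, dtype=int)
    n = embeddings.shape[0]

    # Assumed disagreement rule: at least one member differs from the first member.
    mismatch = (committee_preds != committee_preds[0]).any(axis=0)

    # Distance from every segment to its nearest labeled or already selected segment.
    min_dist = np.full(n, np.inf)
    for i in labeled_idx:
        dist = np.linalg.norm(embeddings - embeddings[i], axis=1)
        min_dist = np.minimum(min_dist, dist)

    available = np.ones(n, dtype=bool)
    available[labeled_idx] = False

    selected = []
    while len(selected) < budget and available.any():
        # Mismatch first: fall back to agreed segments only when none are left.
        pool = available & mismatch
        if not pool.any():
            pool = available
        candidates = np.flatnonzero(pool)
        # Farthest traversal: take the candidate farthest from everything chosen so far.
        pick = int(candidates[np.argmax(min_dist[candidates])])
        selected.append(pick)
        available[pick] = False
        dist = np.linalg.norm(embeddings - embeddings[pick], axis=1)
        min_dist = np.minimum(min_dist, dist)
    return selected
```

In a cold start `labeled_idx` is empty, so all distances begin at infinity and the first pick is simply the first mismatched segment; later picks spread out across the embedding space. A real system would retrain the committee after each annotation round and recompute the predictions before calling the selector again.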