Weakly Supervised Detection of Baby Cry
This work addresses baby monitoring and healthcare needs with a more efficient annotation process, though the contribution is incremental because it builds on existing anomaly detection techniques.
The paper proposes a weakly supervised anomaly detection method for detecting baby cries in audio, achieving improved performance over existing supervised approaches.
Detecting baby cries is an important part of baby monitoring and healthcare. Almost all existing methods use a supervised SVM, a CNN, or variants of these. In this work, we propose using weakly supervised anomaly detection to detect baby cries. Under this weak supervision, we only need a weak, file-level annotation indicating whether an audio file contains a cry. We design a data mining technique that applies the pre-trained VGGish feature extractor and an anomaly detection network to long untrimmed audio files. The resulting datasets are used to train a simple CNN feature network for cry/non-cry classification. This CNN is then used as the feature extractor in an anomaly detection framework to achieve better cry detection performance.
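
To make the weak-supervision idea concrete, the following is a minimal sketch of how file-level labels can drive segment-level scoring. The abstract does not specify the loss or the mining rule, so everything here is an assumption for illustration: the multiple-instance ranking loss is one common formulation of weakly supervised anomaly detection, not necessarily the one used in this work, and the names `split_into_segments`, `mil_ranking_loss`, and `mine_segments`, the 0.96 s segment length, the margin, and the threshold are all hypothetical. The per-segment scores would come from an anomaly detection network applied to segment features; here they are plain arrays.

```python
import numpy as np


def split_into_segments(waveform, sample_rate, segment_seconds=0.96):
    """Cut a 1-D waveform into non-overlapping fixed-length segments.

    The 0.96 s default is an assumption; trailing samples that do not
    fill a whole segment are dropped.
    """
    seg_len = int(round(sample_rate * segment_seconds))
    n_segments = len(waveform) // seg_len
    return waveform[: n_segments * seg_len].reshape(n_segments, seg_len)


def mil_ranking_loss(pos_scores, neg_scores, margin=1.0):
    """Hinge ranking loss over two "bags" of segment scores.

    pos_scores: scores for segments of a file labeled "contains a cry".
    neg_scores: scores for segments of a file labeled "no cry".
    Only the file-level labels are needed: the highest-scoring segment
    of the cry file should exceed the highest-scoring segment of the
    no-cry file by at least `margin`.
    """
    gap = float(np.max(pos_scores)) - float(np.max(neg_scores))
    return max(0.0, margin - gap)


def mine_segments(scores, threshold=0.5):
    """Return indices of segments whose score reaches the threshold.

    A hypothetical mining rule: such segments could be collected as
    candidate cry examples for training a cry/non-cry classifier.
    """
    return np.flatnonzero(np.asarray(scores) >= threshold)


if __name__ == "__main__":
    audio = np.zeros(16000 * 3)  # three seconds of silence at 16 kHz
    print(split_into_segments(audio, 16000).shape)
    print(mil_ranking_loss([0.1, 0.9, 0.2], [0.1, 0.3]))
    print(mine_segments([0.1, 0.9, 0.2, 0.6]))
```

In this sketch the loss never needs to know which segment contains the cry, which is what keeps the annotation cost at one label per file.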