CV · MM · SD · AS · Apr 30, 2024

SemiPL: A Semi-supervised Method for Event Sound Source Localization

arXiv:2404.19615v1 · 1 citation · h-index: 3 · Has Code · 2024 IEEE International Conference on Multimedia and Expo Workshops (ICMEW)
Originality: Synthesis-oriented
AI Analysis

This work addresses localization in chaotic events for applications like crowd management and emergency response, but it is incremental as it builds on existing methods with parameter adjustments and semi-supervised enhancements.

The paper tackles event sound source localization by applying an existing model (SSPL) to a more complex dataset and proposing a semi-supervised improvement method called SemiPL. With parameter adjustment, SSPL improves by 12.2% cIoU and 0.56% AUC on the Chaotic World dataset.

In recent years, Event Sound Source Localization has been widely applied in various fields. Recent works, typically relying on the contrastive learning framework, show impressive performance. However, all of this work is based on large, relatively simple datasets. It is also crucial to understand and analyze human behaviors (actions and interactions of people), voices, and sounds in chaotic events in many applications, e.g., crowd management and emergency response services. In this paper, we apply the existing model to a more complex dataset, explore the influence of parameters on the model, and propose a semi-supervised improvement method, SemiPL. With the increase in data quantity and the influence of label quality, self-supervised learning will be an unstoppable trend. The experiments show that parameter adjustment positively affects the existing model. In particular, SSPL achieved an improvement of 12.2% cIoU and 0.56% AUC on Chaotic World compared to the results provided. The code is available at: https://github.com/ly245422/SSPL
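For readers unfamiliar with the two metrics quoted in the abstract, the following is a minimal Python sketch of how cIoU and AUC are commonly computed in sound source localization work. It is not taken from the paper or its repository: the function names, the 0.5 thresholds, and the 21-point threshold grid are assumptions chosen for illustration, and the paper's exact evaluation protocol may differ.

```python
import numpy as np


def ciou(heatmap, gt, tau=0.5):
    """Consensus IoU between a thresholded heatmap and a ground-truth map.

    heatmap: 2-D array of localization scores in [0, 1].
    gt:      2-D array of the same shape holding consensus weights in [0, 1]
             (a binary mask is the special case with weights 0 or 1).
    tau:     heatmap threshold (assumed value, not from the paper).
    """
    heatmap = np.asarray(heatmap, dtype=float)
    gt = np.asarray(gt, dtype=float)
    pred = heatmap > tau                      # predicted sounding region
    inter = gt[pred].sum()                    # ground-truth weight covered by the prediction
    # Union: all ground-truth weight plus predicted pixels that fall outside it.
    union = gt.sum() + np.logical_and(pred, gt == 0).sum()
    return float(inter / union) if union > 0 else 0.0


def success_rate(cious, threshold=0.5):
    """Fraction of samples whose cIoU reaches the given threshold."""
    return float(np.mean(np.asarray(cious, dtype=float) >= threshold))


def auc(cious, num_thresholds=21):
    """Area under the success-rate curve as the cIoU threshold sweeps 0..1."""
    thresholds = np.linspace(0.0, 1.0, num_thresholds)
    rates = np.array([success_rate(cious, t) for t in thresholds])
    # Trapezoidal rule written out by hand to avoid NumPy version differences.
    return float(np.sum((rates[1:] + rates[:-1]) / 2.0 * np.diff(thresholds)))
```

Under these assumptions, the per-sample `ciou` values for a test set would be passed to `success_rate` (often reported simply as "cIoU" at a 0.5 threshold) and to `auc`.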

Code Implementations: 1 repo
