Uncovering Anomalous Events for Marine Environmental Monitoring via Visual Anomaly Detection
This work addresses the challenge of scalable biodiversity monitoring for marine scientists. Its contribution is incremental: it builds on existing VAD methods and adds a new dataset and analysis.
The paper tackles the impracticality of manually inspecting vast amounts of underwater video footage for marine biodiversity monitoring by applying visual anomaly detection (VAD) with deep neural networks. It introduces the AURA benchmark dataset and shows that model performance varies significantly with the amount of training data and with scene variability.
Underwater video monitoring is a promising strategy for assessing marine biodiversity, but the vast volume of uneventful footage makes manual inspection highly impractical. In this work, we explore the use of visual anomaly detection (VAD) based on deep neural networks to automatically identify interesting or anomalous events. We introduce AURA, the first multi-annotator benchmark dataset for underwater VAD, and evaluate four VAD models across two marine scenes. We demonstrate the importance of robust frame selection strategies to extract meaningful video segments. Our comparison against multiple annotators reveals that VAD performance of current models varies dramatically and is highly sensitive to both the amount of training data and the variability in visual content that defines "normal" scenes. Our results highlight the value of soft and consensus labels and offer a practical approach for supporting scientific exploration and scalable biodiversity monitoring.
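The abstract refers to soft and consensus labels derived from multiple annotators without spelling out how they are computed. As a minimal sketch, one common construction treats the soft label of a frame as the fraction of annotators who marked it anomalous, and the consensus label as a majority vote over that fraction. The function names, the `min_agreement` threshold, and the example annotations below are all hypothetical and are not taken from the paper or the AURA dataset.

```python
from typing import List, Sequence


def soft_labels(annotations: Sequence[Sequence[int]]) -> List[float]:
    """Per-frame fraction of annotators who marked the frame as anomalous."""
    n_annotators = len(annotations)
    if n_annotators == 0:
        raise ValueError("need at least one annotator")
    n_frames = len(annotations[0])
    if any(len(a) != n_frames for a in annotations):
        raise ValueError("all annotators must label the same number of frames")
    return [sum(a[i] for a in annotations) / n_annotators for i in range(n_frames)]


def consensus_labels(soft: Sequence[float], min_agreement: float = 0.5) -> List[int]:
    """Binary label: 1 where strictly more than `min_agreement` of annotators agree.

    The strict majority rule is an assumption made for this sketch.
    """
    return [1 if s > min_agreement else 0 for s in soft]


# Three hypothetical annotators labelling six frames (1 = anomalous).
annotations = [
    [0, 1, 1, 0, 0, 1],
    [0, 1, 0, 0, 0, 1],
    [0, 1, 1, 0, 1, 1],
]
soft = soft_labels(annotations)
consensus = consensus_labels(soft)
```

Soft labels keep annotator disagreement visible (a frame flagged by one of three annotators scores 1/3 instead of being forced to 0 or 1), while consensus labels give a single binary target for metrics that require one.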