CV | Jun 24, 2025

OpenWildlife: Open-Vocabulary Multi-Species Wildlife Detector for Geographically-Diverse Aerial Imagery

arXiv:2506.19204v1 | h-index: 48
Originality: Incremental advance
AI Analysis

OpenWildlife offers a flexible, cost-effective solution for global biodiversity assessments, addressing the tendency of existing methods to generalize poorly across species and environments.

The paper tackles the problem of generalizing wildlife detection across diverse species and environments in aerial imagery by introducing OpenWildlife, an open-vocabulary detector that uses language-aware embeddings and a Grounding-DINO adaptation, achieving up to 0.981 mAP50 with fine-tuning and 0.597 mAP50 on novel species datasets.

We introduce OpenWildlife (OW), an open-vocabulary wildlife detector designed for multi-species identification in diverse aerial imagery. While existing automated methods perform well in specific settings, they often struggle to generalize across different species and environments due to limited taxonomic coverage and rigid model architectures. In contrast, OW leverages language-aware embeddings and a novel adaptation of the Grounding-DINO framework, enabling it to identify species specified through natural language inputs across both terrestrial and marine environments. Trained on 15 datasets, OW outperforms most existing methods, achieving up to 0.981 mAP50 with fine-tuning and 0.597 mAP50 on seven datasets featuring novel species. Additionally, we introduce an efficient search algorithm that combines k-nearest neighbors and breadth-first search to prioritize areas where social species are likely to be found. This approach captures over 95% of species while exploring only 33% of the available images. To support reproducibility, we publicly release our source code and dataset splits, establishing OW as a flexible, cost-effective solution for global biodiversity assessments.
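
The abstract does not spell out the search procedure, so the following is only a minimal sketch of how k-nearest neighbors and breadth-first search might be combined, not the authors' implementation. It assumes each image has a 2D location, that a k-NN graph is built over those locations, and that the search expands to an image's neighbors only when that image contains animals. The names `knn_graph`, `social_search`, `has_animals`, `seeds`, and `budget` are hypothetical and chosen for illustration.

```python
from collections import deque
import math


def knn_graph(coords, k):
    """Map each image index to the indices of its k nearest images (brute force)."""
    graph = {}
    for i, (xi, yi) in enumerate(coords):
        dists = sorted(
            (math.hypot(xi - xj, yi - yj), j)
            for j, (xj, yj) in enumerate(coords)
            if j != i
        )
        graph[i] = [j for _, j in dists[:k]]
    return graph


def social_search(coords, has_animals, seeds, k=2, budget=None):
    """Return the order in which images are inspected.

    has_animals(i) -> bool is a stand-in for running a detector on image i
    (an assumption of this sketch). Neighbors are queued only after a
    positive image, so the search spreads around groups of animals.
    """
    graph = knn_graph(coords, k)
    budget = len(coords) if budget is None else budget
    visited, order = set(), []
    for seed in seeds:
        queue = deque([seed])
        while queue and len(order) < budget:
            i = queue.popleft()
            if i in visited:
                continue
            visited.add(i)
            order.append(i)
            if has_animals(i):
                # Breadth-first expansion to spatial neighbors of a positive image.
                queue.extend(j for j in graph[i] if j not in visited)
    return order


# Toy survey: six images along a transect, animals present in images 1 and 2.
coords = [(0, 0), (1, 0), (2, 0), (3, 0), (10, 0), (11, 0)]
positives = {1, 2}
order = social_search(coords, lambda i: i in positives, seeds=[1, 4], k=2)
print(order)
```

In this toy run the search inspects the cluster around the positive images first and never opens image 5, because its only visited neighbor (image 4) is empty. A real survey would replace the brute-force neighbor lookup with a spatial index and choose seeds by some sampling scheme, both of which are left open here.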

