CV | Jun 10, 2024

UEMM-Air: Make Unmanned Aerial Vehicles Perform More Multi-modal Tasks

arXiv:2406.06230v3 | 4 citations | Has Code
Originality: Synthesis-oriented
AI Analysis

This paper addresses the scarcity and high cost of multi-modal data for UAV research, though the contribution is incremental because it builds on existing synthetic data generation methods.

The authors tackle the lack of high-quality multi-modal datasets for Unmanned Aerial Vehicles (UAVs) by creating UEMM-Air, a synthetic dataset of 120k image pairs across 6 modalities with precise annotations. They report that models pre-trained on UEMM-Air perform better on downstream tasks than models pre-trained on comparable existing datasets.

The development of multi-modal learning for Unmanned Aerial Vehicles (UAVs) typically relies on a large amount of pixel-aligned multi-modal image data. However, existing datasets face challenges such as limited modalities, high construction costs, and imprecise annotations. To this end, we propose a synthetic multi-modal UAV-based multi-task dataset, UEMM-Air. Specifically, we simulate various UAV flight scenarios and object types using the Unreal Engine (UE). We then design the UAV's flight logic to automatically collect data from different scenarios, perspectives, and altitudes. Furthermore, we propose a novel heuristic automatic annotation algorithm to generate accurate object detection labels. Finally, we use the labels to generate text descriptions of the images so that UEMM-Air supports more cross-modality tasks. In total, UEMM-Air consists of 120k pairs of images with 6 modalities and precise annotations. Moreover, we conduct numerous experiments and establish new benchmark results on our dataset. We also find that models pre-trained on UEMM-Air perform better on downstream tasks than models pre-trained on other similar datasets. The dataset is publicly available (https://github.com/1e12Leon/UEMM-Air) to support research on multi-modal tasks for UAVs.
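The abstract mentions generating text descriptions from detection labels but does not say how. A minimal sketch of one possible template-based approach is shown below. It is not the authors' algorithm: the function name `describe`, the `(class_name, box)` label format, and the caption template are all assumptions made for illustration.

```python
from collections import Counter


def describe(labels):
    """Build a one-sentence caption from object detection labels.

    `labels` is a list of (class_name, (x_min, y_min, x_max, y_max)) tuples.
    This format is hypothetical and is NOT the UEMM-Air annotation schema.
    """
    # Count how many boxes each class has; the boxes themselves are ignored.
    counts = Counter(name for name, _box in labels)

    if not counts:
        body = "no annotated objects"
    else:
        # Sort by class name so the caption is deterministic.
        parts = [f"{name} (x{n})" for name, n in sorted(counts.items())]
        if len(parts) > 1:
            body = ", ".join(parts[:-1]) + " and " + parts[-1]
        else:
            body = parts[0]

    return f"Aerial image containing {body}."


if __name__ == "__main__":
    example = [
        ("car", (0, 0, 10, 10)),
        ("truck", (5, 5, 20, 20)),
        ("car", (30, 30, 40, 40)),
    ]
    print(describe(example))
```

A fixed template like this only conveys class counts. Richer captions (object positions, scene type, altitude) would need extra metadata that this sketch does not assume is available.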

Code Implementations (1 repo)
