CVMay 24, 2024

Multimodal Object Detection via Probabilistic a priori Information Integration

arXiv:2405.15596v1h-index: 10
Originality Incremental advance
AI Analysis

This work addresses multimodal object detection in remote sensing, where modalities are not strictly aligned, offering a solution for scenarios where only one modality contains the target object.

The paper tackled the problem of low-quality multimodal data with misaligned modalities in remote sensing object detection by converting contextual binary information into probability maps and using an early fusion architecture, achieving validation through extensive experiments on the DOTA dataset.

Multimodal object detection has shown promise in remote sensing. However, multimodal data frequently encounter the problem of low-quality, wherein the modalities lack strict cell-to-cell alignment, leading to mismatch between different modalities. In this paper, we investigate multimodal object detection where only one modality contains the target object and the others provide crucial contextual information. We propose to resolve the alignment problem by converting the contextual binary information into probability maps. We then propose an early fusion architecture that we validate with extensive experiments on the DOTA dataset.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes