CV · Sep 3, 2024

Frequency-Spatial Entanglement Learning for Camouflaged Object Detection

arXiv:2409.01686v1 · 102 citations · h-index: 30 · Has Code
Originality: Highly original
AI Analysis

This paper addresses camouflaged object detection, a computer vision task that is challenging because camouflaged objects closely resemble their surroundings. It represents an incremental improvement achieved by integrating the frequency and spatial domains.

The paper tackles camouflaged object detection by proposing a Frequency-Spatial Entanglement Learning (FSEL) method that jointly explores frequency and spatial domain representations, achieving superior performance over 21 state-of-the-art methods on three datasets.

Camouflaged object detection has attracted a lot of attention in computer vision. The main challenge lies in the high degree of similarity between camouflaged objects and their surroundings in the spatial domain, making identification difficult. Existing methods attempt to reduce the impact of pixel similarity by maximizing the distinguishing ability of spatial features with complicated design, but often ignore the sensitivity and locality of features in the spatial domain, leading to sub-optimal results. In this paper, we propose a new approach to address this issue by jointly exploring the representation in the frequency and spatial domains, introducing the Frequency-Spatial Entanglement Learning (FSEL) method. This method consists of a series of well-designed Entanglement Transformer Blocks (ETB) for representation learning, a Joint Domain Perception Module for semantic enhancement, and a Dual-domain Reverse Parser for feature integration in the frequency and spatial domains. Specifically, the ETB utilizes frequency self-attention to effectively characterize the relationship between different frequency bands, while the entanglement feed-forward network facilitates information interaction between features of different domains through entanglement learning. Our extensive experiments demonstrate the superiority of our FSEL over 21 state-of-the-art methods, through comprehensive quantitative and qualitative comparisons in three widely-used datasets. The source code is available at: https://github.com/CSYSI/FSEL.
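As a rough illustration of what representing the same feature in both the frequency and spatial domains can mean, here is a minimal NumPy sketch. It is not the FSEL implementation and does not reproduce the ETB, its frequency self-attention, or the entanglement feed-forward network; it only shows a 2D feature map being moved into the frequency domain with an FFT, split into a low-frequency and a high-frequency band, and mapped back to the spatial domain. The function name `split_frequency_bands` and the `radius` parameter are hypothetical and chosen only for this example.

```python
import numpy as np


def split_frequency_bands(feature, radius):
    """Split a 2D map into low- and high-frequency spatial components.

    Hypothetical helper for illustration only; not part of FSEL.
    """
    h, w = feature.shape
    # Move to the frequency domain and shift the zero frequency to the center.
    spectrum = np.fft.fftshift(np.fft.fft2(feature))

    # Circular mask around the center selects the low-frequency band.
    yy, xx = np.ogrid[:h, :w]
    dist = np.sqrt((yy - h // 2) ** 2 + (xx - w // 2) ** 2)
    low_mask = dist <= radius

    # Map each band back to the spatial domain.
    low = np.fft.ifft2(np.fft.ifftshift(spectrum * low_mask)).real
    high = np.fft.ifft2(np.fft.ifftshift(spectrum * ~low_mask)).real
    return low, high


# Example: a random 8x8 "feature map" standing in for a network activation.
rng = np.random.default_rng(0)
feature = rng.standard_normal((8, 8))
low, high = split_frequency_bands(feature, radius=2)
```

Because the two masks are complementary and the FFT is linear, `low + high` reconstructs the original map. The low band carries smooth, global structure and the high band carries edges and fine texture, which is one reason a frequency view can complement purely local spatial features.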

Code Implementations: 1 repo
