MM · CV · Apr 8, 2025

KAN-SAM: Kolmogorov-Arnold Network Guided Segment Anything Model for RGB-T Salient Object Detection

arXiv:2504.05878v1 · 5 citations · h-index: 21 · ICME
Originality: Incremental advance
AI Analysis

This work targets the limited generalization of RGB-T SOD methods in complex scenarios. It is an incremental advance that adapts an existing visual foundation model (SAM2) to the task rather than introducing a new architecture.

The paper tackles limited generalization in RGB-thermal salient object detection by proposing KAN-SAM, a method that extends SAM2 with thermal features injected as prompts through KAN adapters. The authors report that it outperforms state-of-the-art methods on benchmarks.

Existing RGB-thermal salient object detection (RGB-T SOD) methods aim to identify visually significant objects by leveraging both RGB and thermal modalities to enable robust performance in complex scenarios, but they often suffer from limited generalization due to the constrained diversity of available datasets and the inefficiencies in constructing multi-modal representations. In this paper, we propose a novel prompt learning-based RGB-T SOD method, named KAN-SAM, which reveals the potential of visual foundational models for RGB-T SOD tasks. Specifically, we extend Segment Anything Model 2 (SAM2) for RGB-T SOD by introducing thermal features as guiding prompts through efficient and accurate Kolmogorov-Arnold Network (KAN) adapters, which effectively enhance RGB representations and improve robustness. Furthermore, we introduce a mutually exclusive random masking strategy to reduce reliance on RGB data and improve generalization. Experimental results on benchmarks demonstrate superior performance over the state-of-the-art methods.
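The abstract does not spell out how the mutually exclusive random masking strategy works. One plausible reading, sketched below under stated assumptions, is that random patches are hidden in the RGB image and a disjoint set of patches is hidden in the thermal image, so every location stays visible in at least one modality. The function names, the patch grid, and the ratio parameters are all hypothetical and are not taken from the paper.

```python
import numpy as np


def mutually_exclusive_masks(grid_h, grid_w, rgb_ratio=0.3, thermal_ratio=0.3, seed=None):
    """Return two boolean drop-masks of shape (grid_h, grid_w) that never overlap.

    Assumption (not from the paper): masking is done per patch, and a patch
    hidden in RGB is never hidden in thermal, and vice versa.
    """
    if rgb_ratio < 0 or thermal_ratio < 0 or rgb_ratio + thermal_ratio > 1.0:
        raise ValueError("ratios must be non-negative and sum to at most 1")

    rng = np.random.default_rng(seed)
    n = grid_h * grid_w
    order = rng.permutation(n)  # random ordering of all patch indices

    n_rgb = int(round(rgb_ratio * n))
    # Clamp so rounding can never push the two sets past the patch count.
    n_thermal = min(int(round(thermal_ratio * n)), n - n_rgb)

    rgb_drop = np.zeros(n, dtype=bool)
    thermal_drop = np.zeros(n, dtype=bool)
    rgb_drop[order[:n_rgb]] = True                       # first slice: hidden in RGB
    thermal_drop[order[n_rgb:n_rgb + n_thermal]] = True  # next slice: hidden in thermal

    return rgb_drop.reshape(grid_h, grid_w), thermal_drop.reshape(grid_h, grid_w)


def apply_patch_mask(image, drop, patch):
    """Zero out the patches of an (H, W, C) image that are flagged in `drop`."""
    # Expand the patch-level mask to pixel resolution.
    pixel_drop = np.repeat(np.repeat(drop, patch, axis=0), patch, axis=1)
    return image * ~pixel_drop[..., None]


# Usage: an 8x8 image pair split into a 4x4 grid of 2x2 patches.
rgb = np.ones((8, 8, 3), dtype=np.float32)
thermal = np.ones((8, 8, 1), dtype=np.float32)
rgb_drop, thermal_drop = mutually_exclusive_masks(4, 4, 0.25, 0.25, seed=0)
rgb_masked = apply_patch_mask(rgb, rgb_drop, patch=2)
thermal_masked = apply_patch_mask(thermal, thermal_drop, patch=2)
```

Drawing both sets from a single random permutation is what guarantees they are disjoint. Under this reading, a model trained on such pairs cannot recover the RGB-masked patches from RGB alone, which is consistent with the stated goal of reducing reliance on RGB data.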
