CV · Dec 26, 2025
Breaking Alignment Barriers: TPS-Driven Semantic Correlation Learning for Alignment-Free RGB-T Salient Object Detection
Lupiao Hu, Fasheng Wang, Fangmei Chen et al.
Existing RGB-T salient object detection methods predominantly rely on manually aligned and annotated datasets, struggling to handle real-world scenarios with raw, unaligned RGB-T image pairs. In practical applications, due to significant cross-modal disparities such as spatial misalignment, scale variations, and viewpoint shifts, the performance of current methods drastically deteriorates on unaligned datasets. To address this issue, we propose an efficient RGB-T SOD method for real-world unaligned image pairs, termed Thin-Plate Spline-driven Semantic Correlation Learning Network (TPS-SCL). We employ a dual-stream MobileViT as the encoder, combined with efficient Mamba scanning mechanisms, to effectively model correlations between the two modalities while maintaining low parameter counts and computational overhead. To suppress interference from redundant background information during alignment, we design a Semantic Correlation Constraint Module (SCCM) to hierarchically constrain salient features. Furthermore, we introduce a Thin-Plate Spline Alignment Module (TPSAM) to mitigate spatial discrepancies between modalities. Additionally, a Cross-Modal Correlation Module (CMCM) is incorporated to fully explore and integrate inter-modal dependencies, enhancing detection performance. Extensive experiments on various datasets demonstrate that TPS-SCL attains state-of-the-art (SOTA) performance among existing lightweight SOD methods and outperforms mainstream RGB-T SOD approaches.
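To make the thin-plate spline idea concrete, here is a minimal NumPy sketch of the classic TPS fit between two sets of 2D control points. It is not the paper's TPSAM, which the abstract does not specify; the function names `fit_tps` and `apply_tps`, and the assumption that matching control points are already known, are illustrative only.

```python
import numpy as np

def _tps_kernel(r2):
    # Radial basis U(r) = r^2 log(r), written in terms of r^2, with U(0) = 0.
    out = np.zeros_like(r2)
    mask = r2 > 0
    out[mask] = 0.5 * r2[mask] * np.log(r2[mask])
    return out

def fit_tps(src, dst):
    """Solve for TPS parameters mapping control points src -> dst (both (n, 2))."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    n = len(src)
    r2 = ((src[:, None, :] - src[None, :, :]) ** 2).sum(-1)
    K = _tps_kernel(r2)
    P = np.hstack([np.ones((n, 1)), src])   # affine part: [1, x, y]
    A = np.zeros((n + 3, n + 3))
    A[:n, :n] = K
    A[:n, n:] = P
    A[n:, :n] = P.T
    b = np.zeros((n + 3, 2))
    b[:n] = dst
    # First n rows: non-affine weights; last 3 rows: affine coefficients.
    return np.linalg.solve(A, b)

def apply_tps(params, src, pts):
    """Warp arbitrary points pts (m, 2) with parameters fitted on src."""
    src = np.asarray(src, dtype=float)
    pts = np.asarray(pts, dtype=float)
    r2 = ((pts[:, None, :] - src[None, :, :]) ** 2).sum(-1)
    U = _tps_kernel(r2)
    P = np.hstack([np.ones((len(pts), 1)), pts])
    return U @ params[:-3] + P @ params[-3:]
```

In an alignment setting, `src` and `dst` would be corresponding locations in the two modalities, and `apply_tps` would generate the sampling grid used to warp one feature map onto the other. How the control points are obtained is left open here.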
CV · Jul 31, 2025 · Code
ST-SAM: SAM-Driven Self-Training Framework for Semi-Supervised Camouflaged Object Detection
Xihang Hu, Fuming Sun, Jiazhe Liu et al.
Semi-supervised Camouflaged Object Detection (SSCOD) aims to reduce reliance on costly pixel-level annotations by leveraging limited annotated data and abundant unlabeled data. However, existing SSCOD methods based on Teacher-Student frameworks suffer from severe prediction bias and error propagation under scarce supervision, while their multi-network architectures incur high computational overhead and limited scalability. To overcome these limitations, we propose ST-SAM, a highly annotation-efficient yet concise framework that breaks away from conventional SSCOD constraints. Specifically, ST-SAM employs a Self-Training strategy that dynamically filters and expands high-confidence pseudo-labels to enhance a single-model architecture, thereby fundamentally circumventing inter-model prediction bias. Furthermore, by transforming pseudo-labels into hybrid prompts containing domain-specific knowledge, ST-SAM effectively harnesses the Segment Anything Model's potential for specialized tasks to mitigate error accumulation in self-training. Experiments on COD benchmark datasets demonstrate that ST-SAM achieves state-of-the-art performance with only 1% labeled data, outperforming existing SSCOD methods and even matching fully supervised methods. Remarkably, ST-SAM requires training only a single network, without relying on specific models or loss functions. This work establishes a new paradigm for annotation-efficient SSCOD. Code will be available at https://github.com/hu-xh/ST-SAM.
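The pseudo-label filtering step of a self-training round can be sketched as follows. This is an illustration only: the abstract does not give ST-SAM's actual confidence measure or threshold schedule, so the function name `select_pseudo_labels`, the mean-of-max-probability score, and the fixed `threshold` are all assumptions.

```python
import numpy as np

def select_pseudo_labels(pred_maps, threshold=0.9):
    """Keep only predictions confident enough to serve as pseudo-labels.

    pred_maps: dict mapping an image name to a 2D array of foreground
    probabilities in [0, 1].
    Returns a dict mapping the selected names to binary masks.
    """
    selected = {}
    for name, p in pred_maps.items():
        p = np.asarray(p, dtype=float)
        # Assumed score: average per-pixel certainty, max(p, 1 - p).
        confidence = np.maximum(p, 1.0 - p).mean()
        if confidence >= threshold:
            selected[name] = (p >= 0.5).astype(np.uint8)
    return selected
```

In a full loop, the selected masks would be added to the labeled pool and the single model retrained, so the pool can grow from round to round as more predictions pass the filter.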
CV · May 8, 2020
Sparsely-Labeled Source Assisted Domain Adaptation
Wei Wang, Zhihui Wang, Yuankai Xiang et al.
Domain Adaptation (DA) aims to generalize the classifier learned from the source domain to the target domain. Existing DA methods usually assume that rich labels are available in the source domain. However, the source domain often contains a large amount of unlabeled data but only a few labeled samples, and how to transfer knowledge from such a sparsely-labeled source domain to the target domain remains a challenge, which greatly limits the application of these methods in the wild. This paper proposes a novel Sparsely-Labeled Source Assisted Domain Adaptation (SLSA-DA) algorithm to address the challenge of limited labeled source domain samples. Specifically, because labels are scarce, projected clustering is conducted on both the source and target domains, so that the discriminative structures of the data can be leveraged. Label propagation is then adopted to progressively propagate labels from the few labeled source samples to all unlabeled data, so that the cluster labels are revealed correctly. Finally, we jointly align the marginal and conditional distributions to mitigate the cross-domain mismatch problem, and optimize these three procedures iteratively. However, it is nontrivial to incorporate the three procedures into a unified optimization framework, because some of the variables to be optimized appear only implicitly in their formulations, so the procedures cannot promote one another. Remarkably, we prove that projected clustering and conditional distribution alignment can be reformulated into different expressions, so that the implicit variables are made explicit in the respective optimization steps. As such, the variables related to the three procedures can be optimized within a unified framework and facilitate one another, which markedly improves recognition performance.
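The label propagation step can be illustrated with the standard closed-form graph propagation, F = (I - alpha * S)^(-1) Y, on a Gaussian affinity graph. This is a generic sketch, not the SLSA-DA formulation; the function name `propagate_labels`, the Gaussian kernel, and the values of `alpha` and `sigma` are assumptions.

```python
import numpy as np

def propagate_labels(X, y, n_classes, alpha=0.9, sigma=1.0):
    """Spread a few known labels over a similarity graph.

    X: (n, d) features; y: (n,) integer labels, with -1 marking unlabeled samples.
    Returns predicted labels for all n samples.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    n = len(X)
    # Gaussian affinity matrix with no self-loops.
    sq_dist = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-sq_dist / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    # Symmetric normalization S = D^{-1/2} W D^{-1/2}.
    d = W.sum(axis=1)
    S = W / np.sqrt(np.outer(d, d))
    # One-hot rows for labeled samples, zero rows for unlabeled ones.
    Y = np.zeros((n, n_classes))
    idx = np.flatnonzero(y >= 0)
    Y[idx, y[idx]] = 1.0
    # Closed-form solution of the propagation fixed point.
    F = np.linalg.solve(np.eye(n) - alpha * S, Y)
    return F.argmax(axis=1)
```

With one labeled sample per cluster, the remaining samples inherit the label of the cluster they are connected to, which is the behavior the abstract relies on when labels are sparse.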
LG · Dec 24, 2019
Importance Filtered Cross-Domain Adaptation
Wei Wang, Haojie Li, Zhihui Wang et al.
In Domain Adaptation (DA), the category-relevant losses usually occupy a dominant position, and in existing models they are usually built with hard or soft labels. We observe that hard labels are overconfident in the presence of hard samples, whereas soft labels are ambiguous because they involve too many small, noisy probabilities; both can easily cause negative transfer. In addition, the category-irrelevant losses of the Closed-Set DA (CSDA) paradigm fail in Open-Set DA (OSDA), where they must also take a category-relevant form because target samples are split into shared and private classes. To this end, we propose a new unified DA framework, Importance Filtered Cross-Domain Adaptation (IFCDA). First, an importance filtering mechanism is devised to generate filtered soft labels that mitigate negative transfer. Specifically, the soft labels are divided into confident and ambiguous ones. Then, only the maximum probability in each confident label is retained, and a threshold is set to truncate each ambiguous label so that only prominent probabilities are retained. Moreover, a general graph-based label propagation scheme is devised to obtain soft labels in both CSDA and OSDA, in which an extra component is embedded into each label vector so that target novel classes can be detected. Finally, the category-relevant losses in both scenarios are reformulated using the filtered soft labels, while the category-irrelevant MMD loss in CSDA is reformulated in a class-wise MMD-like form using the newly designed importance filtered soft labels. Notably, the CSDA paradigm is the special case in which all extra components are set to 0, so the proposed approach applies to both CSDA and OSDA. Comprehensive experiments on benchmark cross-domain object recognition datasets verify that the proposed approach outperforms several state-of-the-art methods in both scenarios.
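The filtering rule described above (keep only the maximum of a confident label, truncate an ambiguous label at a threshold) can be sketched in a few lines. The abstract does not say how a label is judged confident, nor whether the result is renormalized, so `confident_margin`, `truncate_below`, the renormalization, and the function name `filter_soft_labels` are assumptions made for illustration.

```python
import numpy as np

def filter_soft_labels(probs, confident_margin=0.6, truncate_below=0.2):
    """Filter soft labels row by row.

    probs: (n, c) array whose rows are class probability vectors.
    A row is treated as confident when its maximum is >= confident_margin
    (assumed criterion); otherwise it is treated as ambiguous.
    """
    probs = np.asarray(probs, dtype=float)
    rows = np.arange(len(probs))
    top = probs.argmax(axis=1)
    confident = probs.max(axis=1) >= confident_margin

    filtered = np.zeros_like(probs)
    # Confident labels: retain only the maximum probability.
    filtered[rows[confident], top[confident]] = probs[rows[confident], top[confident]]
    # Ambiguous labels: drop the small probabilities, keep the prominent ones.
    ambiguous = ~confident
    filtered[ambiguous] = np.where(
        probs[ambiguous] >= truncate_below, probs[ambiguous], 0.0
    )
    # Guard: if truncation removed everything, fall back to the top class.
    empty = filtered.sum(axis=1) == 0
    filtered[rows[empty], top[empty]] = probs[rows[empty], top[empty]]
    # Renormalize each row to sum to 1 (an assumption, not stated in the abstract).
    return filtered / filtered.sum(axis=1, keepdims=True)
```

For example, a row such as `[0.9, 0.05, 0.05]` collapses to a one-hot label, while `[0.45, 0.4, 0.15]` keeps its two prominent entries and loses the small third one.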