IR, Apr 15
From Transfer to Collaboration: A Federated Framework for Cross-Market Sequential Recommendation
Jundong Chen, Honglei Zhang, Xiangmou Qu et al.
Cross-market recommendation (CMR) aims to enhance recommendation performance across multiple markets. Due to its inherent characteristics, i.e., data isolation, non-overlapping users, and market heterogeneity, CMR introduces unique challenges and fundamentally differs from cross-domain recommendation (CDR). Existing CMR approaches largely inherit from CDR by adopting the one-to-one transfer paradigm, where a model is pretrained on a source market and then fine-tuned on a target market. However, such a paradigm suffers from (CH1) source degradation, where the source market sacrifices its own performance for the target markets, and (CH2) negative transfer, where market heterogeneity leads to suboptimal performance in target markets. To address these challenges, we propose FeCoSR, a novel federated collaboration framework for cross-market sequential recommendation. Specifically, to tackle CH1, we introduce a many-to-many collaboration paradigm that enables all markets to jointly participate in and benefit from training. It consists of a federated pretraining stage for capturing shared behavior-level patterns, followed by local fine-tuning for market-specific item-level preferences. For CH2, we theoretically and empirically show that vanilla Cross-Entropy (CE) exacerbates market heterogeneity, undermining federated optimization. To address this, we propose a Semantic Soft Cross-Entropy (S^2CE) that leverages shared semantic information to facilitate collaborative behavioral learning across markets. Then, we design a market-specific adaptation module during fine-tuning to capture local item preferences. Extensive experiments on real-world datasets demonstrate the advantages of FeCoSR over other methods.
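The many-to-many pretraining stage described above can be illustrated with a size-weighted parameter average in the style of FedAvg. This is a minimal sketch only: the abstract does not state which aggregation rule FeCoSR uses, so the FedAvg rule, the function name `fed_avg`, and the toy market weights and sizes are all assumptions made for illustration.

```python
def fed_avg(client_weights, client_sizes):
    """Size-weighted average of per-market parameter vectors (FedAvg-style).

    Assumption: each market sends a flat list of floats of equal length,
    and is weighted by its number of local training sequences.
    """
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(n_params)
    ]


# Hypothetical example: two markets, each holding a 2-parameter model.
market_weights = [[1.0, 2.0], [3.0, 4.0]]
market_sizes = [100, 300]  # hypothetical local dataset sizes

global_weights = fed_avg(market_weights, market_sizes)
print(global_weights)
```

In this sketch every market both contributes to and receives the aggregated model, which is the contrast with a one-to-one source-to-target transfer; raw interaction data never leaves a market.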
CV, Mar 9, 2024
Deep learning for multi-label classification of coral conditions in the Indo-Pacific via underwater photogrammetry
Xinlei Shao, Hongruixuan Chen, Kirsty Magson et al.
Since coral reef ecosystems face threats from human activities and climate change, coral conservation programs are implemented worldwide. Monitoring coral health provides references for guiding conservation activities. However, current labor-intensive methods result in a backlog of unsorted images, highlighting the need for automated classification. Few studies have simultaneously utilized accurate annotations along with updated algorithms and datasets. This study aimed to create a dataset representing common coral conditions and associated stressors in the Indo-Pacific. Concurrently, it assessed existing classification algorithms and proposed a new multi-label method for automatically detecting coral conditions and extracting ecological information. A dataset containing over 20,000 high-resolution coral images of different health conditions and stressors was constructed based on the field survey. Seven representative deep learning architectures were tested on this dataset, and their performance was quantitatively evaluated using the F1 metric and the match ratio. Based on this evaluation, a new method utilizing the ensemble learning approach was proposed. The proposed method accurately classified coral conditions as healthy, compromised, dead, and rubble; it also identified corresponding stressors, including competition, disease, predation, and physical issues. This method can help develop the coral image archive, guide conservation activities, and provide references for decision-making for reef managers and conservationists. The proposed ensemble learning approach outperforms others on the dataset, showing State-Of-The-Art (SOTA) performance. Future research should improve its generalizability and accuracy to support global coral conservation efforts.
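The two evaluation measures named above can be sketched for a multi-label setting as follows. The match ratio is taken here in its usual sense of exact-match (subset) accuracy. The abstract does not say which F1 averaging the authors used, so the micro-averaged variant below is an assumption, and the label order and toy predictions are hypothetical.

```python
def match_ratio(y_true, y_pred):
    """Fraction of samples whose predicted label vector matches the truth exactly."""
    exact = sum(1 for t, p in zip(y_true, y_pred) if list(t) == list(p))
    return exact / len(y_true)


def micro_f1(y_true, y_pred):
    """Micro-averaged F1: pool true/false positives and false negatives over all labels."""
    tp = fp = fn = 0
    for t_row, p_row in zip(y_true, y_pred):
        for t, p in zip(t_row, p_row):
            if t == 1 and p == 1:
                tp += 1
            elif t == 0 and p == 1:
                fp += 1
            elif t == 1 and p == 0:
                fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical label order: healthy, compromised, dead, rubble.
y_true = [[1, 0, 0, 0], [0, 1, 1, 0], [0, 0, 0, 1]]
y_pred = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 1]]

print(match_ratio(y_true, y_pred))  # 1 of 3 samples matches exactly
print(micro_f1(y_true, y_pred))     # 0.75
```

The match ratio is the stricter of the two: a prediction that gets three of four labels right still counts as a miss, whereas F1 gives partial credit.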
CV, Mar 29, 2025
Multi-label classification for multi-temporal, multi-spatial coral reef condition monitoring using vision foundation model with adapter learning
Xinlei Shao, Hongruixuan Chen, Fan Zhao et al.
Coral reef ecosystems provide essential ecosystem services, but face significant threats from climate change and human activities. Although advances in deep learning have enabled automatic classification of coral reef conditions, conventional deep models struggle to achieve high performance when processing complex underwater ecological images. Vision foundation models, known for their high accuracy and cross-domain generalizability, offer promising solutions. However, fine-tuning these models requires substantial computational resources and results in high carbon emissions. To address these challenges, adapter learning methods such as Low-Rank Adaptation (LoRA) have emerged as a solution. This study introduces an approach integrating the DINOv2 vision foundation model with the LoRA fine-tuning method. The approach leverages multi-temporal field images collected through underwater surveys at 15 dive sites at Koh Tao, Thailand, with all images labeled according to universal standards used in citizen science-based conservation programs. The experimental results demonstrate that the DINOv2-LoRA model achieved superior accuracy, with a match ratio of 64.77%, compared to 60.34% achieved by the best conventional model. Furthermore, incorporating LoRA reduced the trainable parameters from 1,100M to 5.91M. Transfer learning experiments conducted under different temporal and spatial settings highlight the exceptional generalizability of DINOv2-LoRA across different seasons and sites. This study is the first to explore the efficient adaptation of foundation models for multi-label classification of coral reef conditions under multi-temporal and multi-spatial settings. The proposed method advances the classification of coral reef conditions and provides a tool for monitoring, conserving, and managing coral reef ecosystems.
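The parameter reduction reported above follows from how LoRA works in general: a frozen weight matrix of shape d_out x d_in is augmented with two trainable low-rank factors of shapes d_out x r and r x d_in. The sketch below shows that arithmetic. The layer size and rank are hypothetical and are not the configuration used in the study; only the 1,100M and 5.91M totals come from the abstract.

```python
def full_params(d_out, d_in):
    """Trainable parameters when the whole weight matrix is fine-tuned."""
    return d_out * d_in


def lora_params(d_out, d_in, rank):
    """Trainable parameters for LoRA factors B (d_out x rank) and A (rank x d_in)."""
    return rank * (d_out + d_in)


# Hypothetical single layer: 1024 x 1024 projection, rank 8.
d, r = 1024, 8
print(full_params(d, d))     # 1048576
print(lora_params(d, d, r))  # 16384

# Ratio implied by the totals reported in the abstract (5.91M of 1,100M).
reported_fraction = 5.91 / 1100
print(f"{reported_fraction:.2%} of the parameters remain trainable")
```

Because the LoRA count grows linearly in the layer width while the full count grows quadratically, the saving becomes larger as the backbone gets wider.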
IR, Jan 29
FedUTR: Federated Recommendation with Augmented Universal Textual Representation for Sparse Interaction Scenarios
Kang Fu, Honglei Zhang, Zikai Zhang et al.
Federated recommendations (FRs) have emerged as an on-device privacy-preserving paradigm, attracting considerable attention driven by rising demands for data security. Existing FRs predominantly adopt ID embeddings to represent items, making the quality of item embeddings entirely dependent on users' historical behaviors. However, we empirically observe that this pattern leads to suboptimal recommendation performance in highly sparse data scenarios, due to its strong reliance on historical interactions. To address this issue, we propose a novel method named FedUTR, which incorporates item textual representations as a complement to interaction behaviors, aiming to enhance model performance under high data sparsity. Specifically, we utilize the textual modality as the universal representation to capture generic item knowledge, and design a Collaborative Information Fusion Module (CIFM) to complement each user's personalized interaction information. In addition, we introduce a Local Adaptation Module (LAM) that adaptively exploits the off-the-shelf local model to efficiently preserve client-specific personalized preferences. Moreover, we propose a variant of FedUTR, termed FedUTR-SAR, which incorporates a sparsity-aware resnet component to balance universal and personalized information at a fine granularity. A convergence analysis provides theoretical guarantees for the effectiveness of FedUTR. Extensive experiments on four real-world datasets show that our method achieves superior performance, with improvements of up to 59% across all datasets compared to the SOTA baselines.
IT, Oct 23, 2025
Dual-Domain Deep Learning-Assisted NOMA-CSK Systems for Secure and Efficient Vehicular Communications
Tingting Huang, Jundong Chen, Huanqiang Zeng et al.
Ensuring secure and efficient multi-user (MU) transmission is critical for vehicular communication systems. Chaos-based modulation schemes have garnered considerable interest due to their benefits in physical layer security. However, most existing MU chaotic communication systems, particularly those based on non-coherent detection, suffer from low spectral efficiency due to reference signal transmission, and limited user connectivity under orthogonal multiple access (OMA). While non-orthogonal schemes, such as sparse code multiple access (SCMA)-based DCSK, have been explored, they face high computational complexity and inflexible scalability due to their fixed codebook designs. This paper proposes a deep learning-assisted power domain non-orthogonal multiple access chaos shift keying (DL-NOMA-CSK) system for vehicular communications. A deep neural network (DNN)-based demodulator is designed to learn intrinsic chaotic signal characteristics during offline training, thereby eliminating the need for chaotic synchronization or reference signal transmission. The demodulator employs a dual-domain feature extraction architecture that jointly processes the time-domain and frequency-domain information of chaotic signals, enhancing feature learning under dynamic channels. The DNN is integrated into the successive interference cancellation (SIC) framework to mitigate error propagation issues. Theoretical analysis and extensive simulations demonstrate that the proposed system achieves superior performance in terms of spectral efficiency (SE), energy efficiency (EE), bit error rate (BER), security, and robustness, while maintaining lower computational complexity compared to traditional MU-DCSK and existing DL-aided schemes. These advantages validate its practical viability for secure vehicular communications.
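The successive interference cancellation (SIC) step mentioned above can be illustrated with a generic two-user power-domain NOMA example. This sketch deliberately swaps in BPSK symbols and a simple sign detector in place of the chaos shift keying waveform and DNN demodulator of the proposed system, and it ignores noise and fading. The power split `A1`, `A2` and all function names are hypothetical.

```python
import math

# Hypothetical power allocation: the weak-channel user gets more power.
A1, A2 = 0.8, 0.2  # fractions of total transmit power, A1 + A2 = 1


def superpose(s1, s2):
    """Superposition coding: both users' BPSK symbols share one transmitted sample."""
    return math.sqrt(A1) * s1 + math.sqrt(A2) * s2


def sign(x):
    """Hard decision for a BPSK symbol in {-1, +1}."""
    return 1 if x >= 0 else -1


def sic_decode(received):
    """Decode the high-power symbol first, cancel it, then decode the low-power one."""
    s1_hat = sign(received)                        # user 1 dominates the sum
    residual = received - math.sqrt(A1) * s1_hat   # remove user 1's contribution
    s2_hat = sign(residual)
    return s1_hat, s2_hat


for s1 in (-1, 1):
    for s2 in (-1, 1):
        print((s1, s2), "->", sic_decode(superpose(s1, s2)))
```

If the first decision is wrong, the cancellation adds interference instead of removing it and the second decision fails as well; this is the error propagation issue that the abstract says the DNN-based demodulator is meant to mitigate.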