Shoujin Wang

IR
h-index: 71
Papers: 33
Citations: 2,576
Novelty: 38%
AI Score: 50

33 Papers

IR | Aug 23, 2023 | Code
LLMRec: Benchmarking Large Language Models on Recommendation Task

Junling Liu, Chao Liu, Peilin Zhou et al.

Recently, the fast development of Large Language Models (LLMs) such as ChatGPT has significantly advanced NLP tasks by enhancing the capabilities of conversational models. However, the application of LLMs in the recommendation domain has not been thoroughly investigated. To bridge this gap, we propose LLMRec, an LLM-based recommender system designed for benchmarking LLMs on various recommendation tasks. Specifically, we benchmark several popular off-the-shelf LLMs, such as ChatGPT, LLaMA, and ChatGLM, on five recommendation tasks: rating prediction, sequential recommendation, direct recommendation, explanation generation, and review summarization. Furthermore, we investigate the effectiveness of supervised finetuning to improve LLMs' instruction compliance ability. The benchmark results indicate that LLMs displayed only moderate proficiency in accuracy-based tasks such as sequential and direct recommendation. However, they demonstrated comparable performance to state-of-the-art methods in explainability-based tasks. We also conduct qualitative evaluations to further assess the quality of the content generated by different models, and the results show that LLMs can truly understand the provided information and generate clearer and more reasonable results. We hope that this benchmark will inspire researchers to delve deeper into the potential of LLMs in enhancing recommendation performance. Our code, processed data and benchmark results are available at https://github.com/williamliujl/LLMRec.

IR | Aug 10, 2022
Trustworthy Recommender Systems

Shoujin Wang, Xiuzhen Zhang, Yan Wang et al.

Recommender systems (RSs) aim to help users effectively retrieve items of interest from a large catalogue. For a long time, researchers and practitioners have focused on developing accurate RSs. Recent years have witnessed an increasing number of threats to RSs, coming from attacks, system- and user-generated noise, and system bias. As a result, it has become clear that a strict focus on RS accuracy is limiting and that research must consider other important factors, e.g., trustworthiness. For end users, a trustworthy RS (TRS) should not only be accurate, but also transparent, unbiased and fair, as well as robust to noise and attacks. These observations have led to a paradigm shift in RS research: from accuracy-oriented RSs to TRSs. However, researchers lack a systematic overview and discussion of the literature in this novel and fast-developing field of TRSs. To this end, in this paper, we provide an overview of TRSs, including a discussion of the motivation and basic concepts of TRSs, a presentation of the challenges in building TRSs, and a perspective on the future directions in this area. We also provide a novel conceptual framework to support the construction of TRSs.

CL | Apr 15, 2023 | Code
Medical Question Summarization with Entity-driven Contrastive Learning

Wenpeng Lu, Sibo Wei, Xueping Peng et al.

By summarizing longer consumer health questions into shorter and essential ones, medical question-answering systems can more accurately understand consumer intentions and retrieve suitable answers. However, medical question summarization is very challenging due to obvious distinctions between how patients and doctors describe health problems. Although deep learning has been applied to successfully address the medical question summarization (MQS) task, two challenges remain: how to correctly capture question focus to model its semantic intention, and how to obtain reliable datasets to fairly evaluate performance. To address these challenges, this paper proposes a novel medical question summarization framework based on entity-driven contrastive learning (ECL). ECL employs medical entities present in frequently asked questions (FAQs) as focuses and devises an effective mechanism to generate hard negative samples. This approach compels models to focus on essential information and consequently generate more accurate question summaries. Furthermore, we have discovered that some MQS datasets, such as the iCliniq dataset with a 33% duplicate rate, have significant data leakage issues. To ensure an impartial evaluation of the related methods, this paper carefully examines leaked samples and reorganizes the data into more reasonable datasets. Extensive experiments demonstrate that our ECL method outperforms the existing methods and achieves new state-of-the-art performance, i.e., ROUGE-1 scores of 52.85, 43.16, 41.31, and 43.52 on the MeQSum, CHQ-Summ, iCliniq, and HealthCareMagic datasets, respectively. The code and datasets are available at https://github.com/yrbobo/MQS-ECL.
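
The abstract above reports ROUGE-1 scores. As a rough illustration of what that metric measures, here is a minimal sketch of unigram-overlap F1. It is a simplification and not the evaluation script from the linked repository: whitespace tokenization, lowercasing, and the function name `rouge1_f1` are assumptions made for this example, and the result is on a 0 to 1 scale, whereas the reported scores appear to be on a 0 to 100 scale.

```python
from collections import Counter


def rouge1_f1(candidate: str, reference: str) -> float:
    """Unigram-overlap F1 between a candidate and a reference summary.

    Hypothetical helper: tokenization is a plain lowercase whitespace split.
    """
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    # Counter intersection keeps the minimum count per token (clipped matches).
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)
```

A shorter candidate that matches the reference's key terms scores high on precision but is penalised on recall, which is why the F1 form is commonly reported for summarization.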

LG | Feb 4, 2023
A Survey on Deep Learning based Time Series Analysis with Frequency Transformation

Kun Yi, Qi Zhang, Wei Fan et al.

Recently, frequency transformation (FT) has been increasingly incorporated into deep learning models to significantly enhance state-of-the-art accuracy and efficiency in time series analysis. The advantages of FT, such as high efficiency and a global view, have been rapidly explored and exploited in various time series tasks and applications, demonstrating the promising potential of FT as a new deep learning paradigm for time series analysis. Despite the growing attention and the proliferation of research in this emerging field, there is currently a lack of a systematic review and in-depth analysis of deep learning-based time series models with FT. It is also unclear why FT can enhance time series analysis and what its limitations are in the field. To address these gaps, we present a comprehensive review that systematically investigates and summarizes the recent research advancements in deep learning-based time series analysis with FT. Specifically, we explore the primary approaches used in current models that incorporate FT, the types of neural networks that leverage FT, and the representative FT-equipped models in deep time series analysis. We propose a novel taxonomy to categorize the existing methods in this field, providing a structured overview of the diverse approaches employed in incorporating FT into deep learning models for time series analysis. Finally, we highlight the advantages and limitations of FT for time series modeling and identify potential future research directions that can further contribute to the community of time series analysis.

IV | Jul 29, 2022
Beyond CNNs: Exploiting Further Inherent Symmetries in Medical Image Segmentation

Shuchao Pang, Anan Du, Mehmet A. Orgun et al.

Automatic tumor or lesion segmentation is a crucial step in medical image analysis for computer-aided diagnosis. Although the existing methods based on Convolutional Neural Networks (CNNs) have achieved state-of-the-art performance, many challenges still remain in medical tumor segmentation. This is because, although the human visual system can detect symmetries in 2D images effectively, regular CNNs can only exploit translation invariance, overlooking further inherent symmetries existing in medical images such as rotations and reflections. To solve this problem, we propose a novel group equivariant segmentation framework that encodes those inherent symmetries to learn more precise representations. First, kernel-based equivariant operations are devised on each orientation, which effectively addresses the gaps in learning symmetries in existing approaches. Then, to keep segmentation networks globally equivariant, we design distinctive group layers with layer-wise symmetry constraints. Finally, based on our novel framework, extensive experiments conducted on real-world clinical data demonstrate that a Group Equivariant Res-UNet (named GER-UNet) outperforms its regular CNN-based counterpart and the state-of-the-art segmentation methods in the tasks of hepatic tumor segmentation, COVID-19 lung infection segmentation and retinal vessel detection. More importantly, the newly built GER-UNet also shows potential in reducing the sample complexity and the redundancy of filters, upgrading current segmentation CNNs and delineating organs on other medical imaging modalities.

41.4 | IR | Apr 25 | Code
Structural and Disentangled Adaptation of Large Vision Language Models for Multimodal Recommendation

Zhongtao Rao, Peilin Zhou, Dading Chong et al.

Multimodal recommendation enhances accuracy by leveraging visual and textual signals, and its success largely depends on learning high-quality cross-modal representations. Recent advances in Large Vision-Language Models (LVLMs) offer unified multimodal representation learning, making them a promising backbone. However, applying LVLMs to recommendation remains challenging due to (i) representation misalignment, where domain gaps between item data and general pre-training lead to unaligned embedding spaces, and (ii) gradient conflicts during fine-tuning, where shared adapters cause interference and a lack of discriminative power. To address these challenges, we propose SDA, a lightweight framework for Structural and Disentangled Adaptation, which integrates two components: Cross-Modal Structural Alignment (CMSA) and Modality-Disentangled Adaptation (MoDA). CMSA aligns embeddings using intra-modal structures as a soft teacher, while MoDA mitigates gradient conflicts via expertized, gated low-rank paths to disentangle gradient flows. Experiments on three public Amazon datasets show that SDA integrates seamlessly with existing multimodal and sequential recommenders, yielding average gains of 6.15% in Hit@10 and 8.64% in NDCG@10. It also achieves gains of up to 12.83% and 18.70% on long-tail items with minimal inference overhead. Our code and full experimental results are available at https://github.com/RaoZhongtao/SDA.
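
The gains above are stated in Hit@10 and NDCG@10. For readers unfamiliar with these metrics, the following is a minimal sketch under the common next-item evaluation setting in which each test case has exactly one held-out target item. That setting and the function names are assumptions for this example; the abstract does not describe the evaluation protocol.

```python
import math
from typing import Sequence


def hit_at_k(ranked_items: Sequence[str], target: str, k: int = 10) -> float:
    """Return 1.0 if the held-out target appears in the top-k ranking, else 0.0."""
    return 1.0 if target in ranked_items[:k] else 0.0


def ndcg_at_k(ranked_items: Sequence[str], target: str, k: int = 10) -> float:
    """NDCG@k with a single relevant item.

    The ideal DCG is 1, so the score reduces to 1 / log2(rank + 1),
    where rank is the 1-based position of the target in the ranking.
    """
    for position, item in enumerate(ranked_items[:k]):
        if item == target:
            return 1.0 / math.log2(position + 2)  # position is 0-based
    return 0.0
```

Hit@K only checks whether the target made the cut, while NDCG@K also rewards placing it nearer the top, so the two numbers can move by different amounts for the same model change.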

LG | Jan 27, 2023
Learning Informative Representation for Fairness-aware Multivariate Time-series Forecasting: A Group-based Perspective

Hui He, Qi Zhang, Shoujin Wang et al.

Performance unfairness among variables widely exists in multivariate time series (MTS) forecasting models, since such models may attend to, or be biased towards, certain (advantaged) variables. Addressing this unfairness problem is important for attending equally to all variables and avoiding vulnerable model biases and risks. However, fair MTS forecasting is challenging and has been less studied in the literature. To bridge this significant gap, we formulate the fairness modeling problem as learning informative representations that attend to both advantaged and disadvantaged variables. Accordingly, we propose a novel framework, named FairFor, for fairness-aware MTS forecasting. FairFor is based on adversarial learning to generate both group-independent and group-relevant representations for the downstream forecasting. The framework first leverages a spectral relaxation of the K-means objective to infer variable correlations and thus to group variables. Then, it utilizes a filtering & fusion component to filter the group-relevant information and generate group-independent representations via orthogonality regularization. The group-independent and group-relevant representations form highly informative representations, facilitating knowledge sharing from advantaged variables to disadvantaged variables to guarantee fairness. Extensive experiments on four public datasets demonstrate the effectiveness of our proposed FairFor for fair forecasting and significant performance improvement.

LG | Nov 10, 2023
Frequency-domain MLPs are More Effective Learners in Time Series Forecasting

Kun Yi, Qi Zhang, Wei Fan et al.

Time series forecasting plays a key role in various industrial domains, including finance, traffic, energy, and healthcare. While the existing literature has designed many sophisticated architectures based on RNNs, GNNs, or Transformers, another kind of approach based on multi-layer perceptrons (MLPs) has been proposed with simple structure, low complexity, and superior performance. However, most MLP-based forecasting methods suffer from point-wise mappings and an information bottleneck, which largely hinder the forecasting performance. To overcome this problem, we explore a novel direction of applying MLPs in the frequency domain for time series forecasting. We investigate the learned patterns of frequency-domain MLPs and discover two inherent characteristics that benefit forecasting: (i) global view: the frequency spectrum gives MLPs a complete view of signals and lets them learn global dependencies more easily, and (ii) energy compaction: frequency-domain MLPs concentrate on a smaller key part of the frequency components with compact signal energy. Then, we propose FreTS, a simple yet effective architecture built upon Frequency-domain MLPs for Time Series forecasting. FreTS mainly involves two stages: (i) Domain Conversion, which transforms time-domain signals into complex numbers in the frequency domain; and (ii) Frequency Learning, which applies our redesigned MLPs to learn the real and imaginary parts of the frequency components. The above stages, operated on both inter-series and intra-series scales, further contribute to channel-wise and time-wise dependency learning. Extensive experiments on 13 real-world benchmarks (including 7 benchmarks for short-term forecasting and 6 benchmarks for long-term forecasting) demonstrate our consistent superiority over state-of-the-art methods.
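
The two stages named in the abstract above (domain conversion, then learning on real and imaginary parts) can be illustrated with a toy sketch. This is not the FreTS implementation: it uses a naive discrete Fourier transform, a single complex-valued linear layer with no nonlinearity, and hypothetical function and parameter names (`dft`, `idft`, `frequency_layer`, `w_real`, `w_imag`) chosen for this example.

```python
import cmath
from typing import List


def dft(signal: List[float]) -> List[complex]:
    """Domain conversion: naive O(n^2) discrete Fourier transform."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum: List[complex]) -> List[float]:
    """Inverse transform back to the time domain (real part only)."""
    n = len(spectrum)
    return [
        (sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n).real
        for t in range(n)
    ]


def frequency_layer(
    spectrum: List[complex],
    w_real: List[List[float]],
    w_imag: List[List[float]],
    b_real: List[float],
    b_imag: List[float],
) -> List[complex]:
    """Complex linear layer computed on separate real and imaginary parts.

    Uses (xr + i*xi) * (wr + i*wi) = (xr*wr - xi*wi) + i*(xr*wi + xi*wr).
    """
    n = len(spectrum)
    out = []
    for j in range(n):
        re, im = b_real[j], b_imag[j]
        for k in range(n):
            xr, xi = spectrum[k].real, spectrum[k].imag
            re += xr * w_real[k][j] - xi * w_imag[k][j]
            im += xr * w_imag[k][j] + xi * w_real[k][j]
        out.append(complex(re, im))
    return out
```

Because every output frequency component is a weighted sum over all input components, and each component itself summarises the whole window, a single layer already mixes information across all time steps, which is one way to read the "global view" property the abstract describes.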

IR | May 22, 2022
Sequential/Session-based Recommendations: Challenges, Approaches, Applications and Opportunities

Shoujin Wang, Qi Zhang, Liang Hu et al.

In recent years, sequential recommender systems (SRSs) and session-based recommender systems (SBRSs) have emerged as a new paradigm of RSs to capture users' short-term but dynamic preferences for enabling more timely and accurate recommendations. Although SRSs and SBRSs have been extensively studied, there are many inconsistencies in this area caused by the diverse descriptions, settings, assumptions and application domains. There is no work to provide a unified framework and problem statement to remove the commonly existing and various inconsistencies in the area of SR/SBR. There is a lack of work to provide a comprehensive and systematic demonstration of the data characteristics, key challenges, most representative and state-of-the-art approaches, typical real-world applications and important future research directions in the area. This work aims to fill in these gaps so as to facilitate further research in this exciting and vibrant area.

SI | Apr 27, 2023
Rumor Detection with Hierarchical Representation on Bipartite Adhoc Event Trees

Qi Zhang, Yayi Yang, Chongyang Shi et al.

The rapid growth of social media has caused tremendous effects on information propagation, raising extreme challenges in detecting rumors. Existing rumor detection methods typically exploit the reposting propagation of a rumor candidate for detection by regarding all reposts to a rumor candidate as a temporal sequence and learning semantics representations of the repost sequence. However, extracting informative support from the topological structure of propagation and the influence of reposting authors for debunking rumors is crucial, which generally has not been well addressed by existing methods. In this paper, we organize a claim post in circulation as an adhoc event tree, extract event elements, and convert it to bipartite adhoc event trees in terms of both posts and authors, i.e., author tree and post tree. Accordingly, we propose a novel rumor detection model with hierarchical representation on the bipartite adhoc event trees called BAET. Specifically, we introduce word embedding and feature encoder for the author and post tree, respectively, and design a root-aware attention module to perform node representation. Then we adopt the tree-like RNN model to capture the structural correlations and propose a tree-aware attention module to learn tree representation for the author tree and post tree, respectively. Extensive experimental results on two public Twitter datasets demonstrate the effectiveness of BAET in exploring and exploiting the rumor propagation structure and the superior detection performance of BAET over state-of-the-art baseline methods.

CL | Sep 1, 2022
An Ion Exchange Mechanism Inspired Story Ending Generator for Different Characters

Xinyu Jiang, Qi Zhang, Chongyang Shi et al.

Story ending generation aims at generating reasonable endings for a given story context. Most existing studies in this area focus on generating coherent or diversified story endings, while ignoring that different characters may lead to different endings for a given story. In this paper, we propose a Character-oriented Story Ending Generator (CoSEG) to customize an ending for each character in a story. Specifically, we first propose a character modeling module to learn the personalities of characters from their descriptive experiences extracted from the story context. Then, inspired by the ion exchange mechanism in chemical reactions, we design a novel vector breaking/forming module to learn the intrinsic interactions between each character and the corresponding context through an analogical information exchange procedure. Finally, we leverage the attention mechanism to learn effective character-specific interactions and feed each interaction into a decoder to generate character-oriented endings. Extensive experimental results and case studies demonstrate that CoSEG achieves significant improvements in the quality of generated endings compared with state-of-the-art methods, and it effectively customizes the endings for different characters.

IV | Oct 25, 2024 | Code
NeuroClips: Towards High-fidelity and Smooth fMRI-to-Video Reconstruction

Zixuan Gong, Guangyin Bao, Qi Zhang et al.

Reconstruction of static visual stimuli from non-invasive fMRI recordings of brain activity has achieved great success, owing to advanced deep learning models such as CLIP and Stable Diffusion. However, research on fMRI-to-video reconstruction remains limited, since decoding the spatiotemporal perception of continuous visual experiences is formidably challenging. We contend that the key to addressing these challenges lies in accurately decoding both high-level semantics and low-level perception flows, as perceived by the brain in response to video stimuli. To this end, we propose NeuroClips, an innovative framework to decode high-fidelity and smooth video from fMRI. NeuroClips utilizes a semantics reconstructor to reconstruct video keyframes, guiding semantic accuracy and consistency, and employs a perception reconstructor to capture low-level perceptual details, ensuring video smoothness. During inference, it adopts a pre-trained T2V diffusion model injected with both keyframes and low-level perception flows for video reconstruction. Evaluated on a publicly available fMRI-video dataset, NeuroClips achieves smooth high-fidelity video reconstruction of up to 6s at 8FPS, gaining significant improvements over state-of-the-art models in various metrics, e.g., a 128% improvement in SSIM and an 81% improvement in spatiotemporal metrics. Our project is available at https://github.com/gongzix/NeuroClips.

LG | Jul 18, 2024
Robust Multivariate Time Series Forecasting against Intra- and Inter-Series Transitional Shift

Hui He, Qi Zhang, Kun Yi et al.

The non-stationary nature of real-world Multivariate Time Series (MTS) data presents forecasting models with the formidable challenge of a time-variant distribution, referred to as distribution shift. Existing studies on distribution shift mostly adhere to adaptive normalization techniques for alleviating temporal mean and covariance shifts, or to time-variant modeling for capturing temporal shifts. Despite improving model generalization, these normalization-based methods often assume a time-invariant transition between outputs and inputs but disregard specific intra-/inter-series correlations, while time-variant models overlook the intrinsic causes of the distribution shift. This limits model expressiveness and interpretability in tackling the distribution shift for MTS forecasting. To mitigate this dilemma, we present a unified Probabilistic Graphical Model that jointly captures intra-/inter-series correlations and models the time-variant transitional distribution, and instantiate a neural framework called JointPGM for non-stationary MTS forecasting. Specifically, JointPGM first employs multiple Fourier basis functions to learn dynamic time factors and designs two distinct learners: intra-series and inter-series learners. The intra-series learner effectively captures temporal dynamics by utilizing temporal gates, while the inter-series learner explicitly models spatial dynamics through multi-hop propagation, incorporating Gumbel-softmax sampling. These two types of series dynamics are subsequently fused into a latent variable, which is inversely employed to infer time factors, generate the final prediction, and perform reconstruction. We validate the effectiveness and efficiency of JointPGM through extensive experiments on six highly non-stationary MTS datasets, achieving state-of-the-art MTS forecasting performance.
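
The abstract above mentions Gumbel-softmax sampling in the inter-series learner. As background on that generic technique only, and not on how JointPGM applies it, here is a minimal sketch. The function name, the `temperature` default, and the optional `rng` argument are assumptions for this example.

```python
import math
import random
from typing import List, Optional


def gumbel_softmax(
    logits: List[float],
    temperature: float = 1.0,
    rng: Optional[random.Random] = None,
) -> List[float]:
    """Draw one relaxed categorical sample over len(logits) classes."""
    if rng is None:
        rng = random.Random()
    noisy = []
    for logit in logits:
        # Clamp the uniform draw away from 0 and 1 so both logs are finite.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        gumbel_noise = -math.log(-math.log(u))  # Gumbel(0, 1) sample
        noisy.append((logit + gumbel_noise) / temperature)
    # Numerically stable softmax over the perturbed, temperature-scaled logits.
    peak = max(noisy)
    exps = [math.exp(v - peak) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]
```

Lower temperatures push the output towards a one-hot vector, while higher temperatures make it smoother; in a differentiable framework this relaxation is what lets gradients flow through an otherwise discrete choice.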

CV | Jun 7, 2024 | Code
RU-AI: A Large Multimodal Dataset for Machine-Generated Content Detection

Liting Huang, Zhihao Zhang, Yiran Zhang et al.

The capability of recent generative AI models to create realistic and human-like content is significantly transforming the ways in which people communicate, create and work. Machine-generated content is a double-edged sword. On one hand, it can benefit society when used appropriately. On the other hand, it may mislead people, posing threats to society, especially when mixed together with natural content created by humans. Hence, there is an urgent need to develop effective methods to detect machine-generated content. However, the lack of aligned multimodal datasets has inhibited the development of such methods, particularly in triple-modality settings (e.g., text, image, and voice). In this paper, we introduce RU-AI, a new large-scale multimodal dataset for robust and effective detection of machine-generated content in text, image and voice. Our dataset is constructed on the basis of three large publicly available datasets: Flickr8K, COCO and Places205, by adding their corresponding AI duplicates, resulting in a total of 1,475,370 instances. In addition, we created an additional noise variant of the dataset for testing the robustness of detection models. We conducted extensive experiments with the current SOTA detection methods on our dataset. The results reveal that existing models still struggle to achieve accurate and robust detection on our dataset. We hope that this new dataset can promote research in the field of machine-generated content detection, fostering the responsible use of generative AI. The source code and datasets are available at https://github.com/ZhihaoZhang97/RU-AI.

CL | Feb 18, 2024
MSynFD: Multi-hop Syntax aware Fake News Detection

Liang Xiao, Qi Zhang, Chongyang Shi et al.

The proliferation of social media platforms has fueled the rapid dissemination of fake news, posing threats to our real-life society. Existing methods use multimodal data or contextual information to enhance the detection of fake news by analyzing news content and/or its social context. However, these methods often overlook essential textual news content (articles) and heavily rely on sequential modeling and global attention to extract semantic information. These existing methods fail to handle the complex, subtle twists in news articles, such as syntax-semantics mismatches and prior biases, leading to lower performance and potential failure when modalities or social context are missing. To bridge these significant gaps, we propose a novel multi-hop syntax aware fake news detection (MSynFD) method, which incorporates complementary syntax information to deal with subtle twists in fake news. Specifically, we introduce a syntactical dependency graph and design a multi-hop subgraph aggregation mechanism to capture multi-hop syntax. It extends the effect of word perception, leading to effective noise filtering and adjacent relation enhancement. Subsequently, a sequential relative position-aware Transformer is designed to capture the sequential information, together with an elaborate keyword debiasing module to mitigate the prior bias. Extensive experimental results on two public benchmark datasets verify the effectiveness and superior performance of our proposed MSynFD over state-of-the-art detection models.

IR | Oct 30, 2024
Dual Contrastive Transformer for Hierarchical Preference Modeling in Sequential Recommendation

Chengkai Huang, Shoujin Wang, Xianzhi Wang et al.

Sequential recommender systems (SRSs) aim to predict the subsequent items which may interest users by comprehensively modeling users' complex preferences embedded in the sequence of user-item interactions. However, most existing SRSs model only users' low-level preference based on item ID information while ignoring the high-level preference revealed by item attribute information, such as item category. Furthermore, they often utilize limited sequence context information to predict the next item while overlooking richer inter-item semantic relations. To this end, in this paper, we propose a novel hierarchical preference modeling framework to model the complex low- and high-level preference dynamics for accurate sequential recommendation. Specifically, in the framework, a novel dual-transformer module and a novel dual contrastive learning scheme are designed to discriminatively learn users' low- and high-level preferences and to effectively enhance both low- and high-level preference learning, respectively. In addition, a novel semantics-enhanced context embedding module is devised to generate more informative context embeddings for further improving the recommendation performance. Extensive experiments on six real-world datasets demonstrate both the superiority of our proposed method over the state-of-the-art ones and the rationality of our design.

IR | Oct 29, 2024
Modeling Temporal Positive and Negative Excitation for Sequential Recommendation

Chengkai Huang, Shoujin Wang, Xianzhi Wang et al.

Sequential recommendation aims to predict the next item which interests users by modeling their interest in items over time. Most of the existing works on sequential recommendation model users' dynamic interest in specific items while overlooking users' static interest revealed by some static attribute information of items, e.g., category or brand. Moreover, existing works often only consider the positive excitation of a user's historical interactions on his/her next choice among candidate items while ignoring the commonly existing negative excitation, resulting in insufficient modeling of dynamic interest. Overlooking static interest and negative excitation leads to incomplete interest modeling and thus impedes the recommendation performance. To this end, in this paper, we propose modeling both static interest and negative excitation for dynamic interest to further improve the recommendation performance. Accordingly, we design a novel Static-Dynamic Interest Learning (SDIL) framework featuring a novel Temporal Positive and Negative Excitation Modeling (TPNE) module for accurate sequential recommendation. TPNE is specially designed for comprehensively modeling dynamic interest based on temporal positive and negative excitation learning. Extensive experiments on three real-world datasets show that SDIL can effectively capture both static and dynamic interest and outperforms state-of-the-art baselines.

CL | May 5, 2025
A Survey on Progress in LLM Alignment from the Perspective of Reward Design

Miaomiao Ji, Yanqiu Wu, Zhibin Wu et al.

Reward design plays a pivotal role in aligning large language models (LLMs) with human values, serving as the bridge between feedback signals and model optimization. This survey provides a structured organization of reward modeling and addresses three key aspects: mathematical formulation, construction practices, and interaction with optimization paradigms. Building on this, it develops a macro-level taxonomy that characterizes reward mechanisms along complementary dimensions, thereby offering both conceptual clarity and practical guidance for alignment research. The progression of LLM alignment can be understood as a continuous refinement of reward design strategies, with recent developments highlighting paradigm shifts from reinforcement learning (RL)-based to RL-free optimization and from single-task to multi-objective and complex settings.

CL | Mar 12, 2025
A Survey on Enhancing Causal Reasoning Ability of Large Language Models

Xin Li, Zhuo Cai, Shoujin Wang et al.

Large language models (LLMs) have recently shown remarkable performance in language tasks and beyond. However, due to their limited inherent causal reasoning ability, LLMs still face challenges in handling tasks that require robust causal reasoning ability, such as health-care and economic analysis. As a result, a growing body of research has focused on enhancing the causal reasoning ability of LLMs. Despite the booming research, a survey that thoroughly reviews the challenges, progress and future directions in this area is still lacking. To bridge this significant gap, we systematically review the literature on how to strengthen LLMs' causal reasoning ability in this paper. We start with an introduction to the background and motivations of this topic, followed by a summarisation of the key challenges in this area. Thereafter, we propose a novel taxonomy to systematically categorise existing methods, together with detailed comparisons within and between classes of methods. Furthermore, we summarise existing benchmarks and evaluation metrics for assessing LLMs' causal reasoning ability. Finally, we outline future research directions for this emerging field, offering insights and inspiration to researchers and practitioners in the area.

SI | Feb 20, 2025
A Macro- and Micro-Hierarchical Transfer Learning Framework for Cross-Domain Fake News Detection

Xuankai Yang, Yan Wang, Xiuzhen Zhang et al.

Cross-domain fake news detection aims to mitigate domain shift and improve detection performance by transferring knowledge across domains. Existing approaches transfer knowledge based on news content and user engagements from a source domain to a target domain. However, these approaches face two main limitations, hindering effective knowledge transfer and optimal fake news detection performance. Firstly, from a micro perspective, they neglect the negative impact of veracity-irrelevant features in news content when transferring domain-shared features across domains. Secondly, from a macro perspective, existing approaches ignore the relationship between user engagement and news content, which reveals shared behaviors of common users across domains and can facilitate more effective knowledge transfer. To address these limitations, we propose a novel macro- and micro- hierarchical transfer learning framework (MMHT) for cross-domain fake news detection. Firstly, we propose a micro-hierarchical disentangling module to disentangle veracity-relevant and veracity-irrelevant features from news content in the source domain for improving fake news detection performance in the target domain. Secondly, we propose a macro-hierarchical transfer learning module to generate engagement features based on common users' shared behaviors in different domains for improving effectiveness of knowledge transfer. Extensive experiments on real-world datasets demonstrate that our framework significantly outperforms the state-of-the-art baselines.

39.0 | LG | Apr 1
Neural Federated Learning for Livestock Growth Prediction

Shoujin Wang, Mingze Ni, Wei Liu et al.

Livestock growth prediction is essential for optimising farm management and improving the efficiency and sustainability of livestock production, yet it remains underexplored due to limited large-scale datasets and privacy concerns surrounding farm-level data. Existing biophysical models rely on fixed formulations, while most machine learning approaches are trained on small, isolated datasets, limiting their robustness and generalisability. To address these challenges, we propose LivestockFL, the first federated learning framework specifically designed for livestock growth prediction. LivestockFL enables collaborative model training across distributed farms without sharing raw data, thereby preserving data privacy while alleviating data sparsity, particularly for farms with limited historical records. The framework employs a neural architecture based on a Gated Recurrent Unit combined with a multilayer perceptron to model temporal growth patterns from historical weight records and auxiliary features. We further introduce LivestockPFL, a novel personalised federated learning framework that extends the above federated learning framework with a personalized prediction head trained on each farm's local data, producing farm-specific predictors. Experiments on a real-world dataset demonstrate the effectiveness and practicality of the proposed approaches.
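The abstract above does not say how the farms' locally trained models are combined. As a rough illustration only, the sketch below assumes FedAvg-style aggregation, in which a server averages client parameters weighted by local dataset size. The function name `federated_average`, the parameter layout (a dict of flat float lists), and the farm names are hypothetical and are not taken from LivestockFL.

```python
from typing import Dict, List

# Hypothetical parameter layout: layer name -> flat list of floats.
Weights = Dict[str, List[float]]


def federated_average(client_weights: List[Weights], client_sizes: List[int]) -> Weights:
    """Average client parameters, weighting each client by its sample count.

    This is the generic FedAvg aggregation step, assumed here for illustration;
    only parameters leave each client, never the raw records.
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    averaged: Weights = {}
    for name, reference in client_weights[0].items():
        averaged[name] = [
            sum(w[name][i] * n for w, n in zip(client_weights, client_sizes)) / total
            for i in range(len(reference))
        ]
    return averaged


# Two hypothetical farms; farm_b holds three times as many records as farm_a.
farm_a = {"head": [1.0, 2.0]}
farm_b = {"head": [3.0, 4.0]}
global_weights = federated_average([farm_a, farm_b], [1, 3])
```

Under the same assumption, a personalised variant in the spirit of LivestockPFL would keep a prediction head out of this averaging step and train it only on each farm's local data.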

LGSep 22, 2025
Revealing Multimodal Causality with Large Language Models

Jin Li, Shoujin Wang, Qi Zhang et al.

Uncovering cause-and-effect mechanisms from data is fundamental to scientific progress. While large language models (LLMs) show promise for enhancing causal discovery (CD) from unstructured data, their application to the increasingly prevalent multimodal setting remains a critical challenge. Even with the advent of multimodal LLMs (MLLMs), their efficacy in multimodal CD is hindered by two primary limitations: (1) difficulty in exploring intra- and inter-modal interactions for comprehensive causal variable identification; and (2) insufficiency to handle structural ambiguities with purely observational data. To address these challenges, we propose MLLM-CD, a novel framework for multimodal causal discovery from unstructured data. It consists of three key components: (1) a novel contrastive factor discovery module to identify genuine multimodal factors based on the interactions explored from contrastive sample pairs; (2) a statistical causal structure discovery module to infer causal relationships among discovered factors; and (3) an iterative multimodal counterfactual reasoning module to refine the discovery outcomes iteratively by incorporating the world knowledge and reasoning capabilities of MLLMs. Extensive experiments on both synthetic and real-world datasets demonstrate the effectiveness of the proposed MLLM-CD in revealing genuine factors and causal relationships among them from multimodal unstructured data.

CLMay 23, 2023
Causal Intervention for Abstractive Related Work Generation

Jiachang Liu, Qi Zhang, Chongyang Shi et al.

Abstractive related work generation has attracted increasing attention in generating coherent related work that better helps readers grasp the background in the current research. However, most existing abstractive models ignore the inherent causality of related work generation, leading to low quality of generated related work and spurious correlations that affect the models' generalizability. In this study, we argue that causal intervention can address these limitations and improve the quality and coherence of the generated related works. To this end, we propose a novel Causal Intervention Module for Related Work Generation (CaM) to effectively capture causalities in the generation process and improve the quality and coherence of the generated related works. Specifically, we first model the relations among sentence order, document relation, and transitional content in related work generation using a causal graph. Then, to implement the causal intervention and mitigate the negative impact of spurious correlations, we use do-calculus to derive ordinary conditional probabilities and identify causal effects through CaM. Finally, we subtly fuse CaM with Transformer to obtain an end-to-end generation model. Extensive experiments on two real-world datasets show that causal interventions in CaM can effectively promote the model to learn causal relations and produce related work of higher quality and coherence.
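For readers unfamiliar with the idea of using do-calculus to reduce an interventional quantity to ordinary conditional probabilities, the textbook backdoor adjustment is one standard instance. The symbols below are generic assumptions for illustration (a treatment X, an outcome Y, and a confounder set Z satisfying the backdoor criterion); they are not the variables or the causal graph used in CaM.

```latex
P\bigl(y \mid \mathrm{do}(x)\bigr) \;=\; \sum_{z} P(y \mid x, z)\, P(z)
```

The left-hand side involves an intervention, while every term on the right can be estimated from observational data.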

IROct 12, 2021
Aspect-driven User Preference and News Representation Learning for News Recommendation

Rongyao Wang, Wenpeng Lu, Shoujin Wang et al.

News recommender systems are essential for helping users efficiently and effectively find news of interest among a large volume of articles. Most existing news recommender systems learn topic-level representations of users and news for recommendation, and neglect the more informative aspect-level features of users and news that would support more accurate recommendation. As a result, they achieve limited recommendation performance. To address this deficiency, we propose a novel Aspect-driven News Recommender System (ANRS) built on aspect-level user preference and news representation learning. Here, a news aspect is fine-grained semantic information expressed by a set of related words, which indicates a specific aspect described by the news. In ANRS, a news aspect-level encoder and a user aspect-level encoder are devised to learn fine-grained aspect-level representations of the user's preferences and news characteristics respectively, which are fed into a click predictor to estimate the probability of the user clicking the candidate news. Extensive experiments on the commonly used real-world dataset MIND demonstrate the superiority of our method compared with representative and state-of-the-art methods.

IRJul 15, 2021
Next-item Recommendations in Short Sessions

Wenzhuo Song, Shoujin Wang, Yan Wang et al.

The changing preferences of users towards items trigger the emergence of session-based recommender systems (SBRSs), which aim to model the dynamic preferences of users for next-item recommendations. However, most of the existing studies on SBRSs are based on long sessions only for recommendations, ignoring short sessions, though short sessions, in fact, account for a large proportion in most of the real-world datasets. As a result, the applicability of existing SBRS solutions is greatly reduced. In a short session, quite limited contextual information is available, making the next-item recommendation very challenging. To this end, in this paper, inspired by the success of few-shot learning (FSL) in effectively learning a model with limited instances, we formulate the next-item recommendation as an FSL problem. Accordingly, following the basic idea of a representative approach for FSL, i.e., meta-learning, we devise an effective SBRS called INter-SEssion collaborative Recommender neTwork (INSERT) for next-item recommendations in short sessions. With the carefully devised local module and global module, INSERT is able to learn an optimal preference representation of the current user in a given short session. In particular, in the global module, a similar session retrieval network (SSRN) is designed to find the sessions similar to the current short session from the historical sessions of both the current user and other users, respectively. The obtained similar sessions are then utilized to complement and optimize the preference representation learned from the current short session by the local module for more accurate next-item recommendations in this short session. Extensive experiments conducted on two real-world datasets demonstrate the superiority of our proposed INSERT over the state-of-the-art SBRSs when making next-item recommendations in short sessions.

IRMay 13, 2021
Graph Learning based Recommender Systems: A Review

Shoujin Wang, Liang Hu, Yan Wang et al.

Recent years have witnessed the fast development of the emerging topic of Graph Learning based Recommender Systems (GLRS). GLRS employ advanced graph learning approaches to model users' preferences and intentions as well as items' characteristics for recommendations. Differently from other RS approaches, including content-based filtering and collaborative filtering, GLRS are built on graphs where the important objects, e.g., users, items, and attributes, are either explicitly or implicitly connected. With the rapid development of graph learning techniques, exploring and exploiting homogeneous or heterogeneous relations in graphs are a promising direction for building more effective RS. In this paper, we provide a systematic review of GLRS, by discussing how they extract important knowledge from graph-based representations to improve the accuracy, reliability and explainability of the recommendations. First, we characterize and formalize GLRS, and then summarize and categorize the key challenges and main progress in this novel research area. Finally, we share some new research directions in this vibrant area.

IRSep 15, 2020
Stratified and Time-aware Sampling based Adaptive Ensemble Learning for Streaming Recommendations

Yan Zhao, Shoujin Wang, Yan Wang et al.

Recommender systems have played an increasingly important role in providing users with tailored suggestions based on their preferences. However, the conventional offline recommender systems cannot handle the ubiquitous data stream well. To address this issue, Streaming Recommender Systems (SRSs) have emerged in recent years, which incrementally train recommendation models on newly received data for effective real-time recommendations. Focusing on new data only benefits addressing concept drift, i.e., the changing user preferences towards items. However, it impedes capturing long-term user preferences. In addition, the commonly existing underload and overload problems should be well tackled for higher accuracy of streaming recommendations. To address these problems, we propose a Stratified and Time-aware Sampling based Adaptive Ensemble Learning framework, called STS-AEL, to improve the accuracy of streaming recommendations. In STS-AEL, we first devise stratified and time-aware sampling to extract representative data from both new data and historical data to address concept drift while capturing long-term user preferences. Also, incorporating the historical data benefits utilizing the idle resources in the underload scenario more effectively. After that, we propose adaptive ensemble learning to efficiently process the overloaded data in parallel with multiple individual recommendation models, and then effectively fuse the results of these models with a sequential adaptive mechanism. Extensive experiments conducted on three real-world datasets demonstrate that STS-AEL, in all the cases, significantly outperforms the state-of-the-art SRSs.

IRSep 14, 2020
Double-Wing Mixture of Experts for Streaming Recommendations

Yan Zhao, Shoujin Wang, Yan Wang et al.

Streaming Recommender Systems (SRSs) commonly train recommendation models on newly received data only to address user preference drift, i.e., the changing user preferences towards items. However, this practice overlooks the long-term user preferences embedded in historical data. More importantly, the common heterogeneity in data stream greatly reduces the accuracy of streaming recommendations. The reason is that different preferences (or characteristics) of different types of users (or items) cannot be well learned by a unified model. To address these two issues, we propose a Variational and Reservoir-enhanced Sampling based Double-Wing Mixture of Experts framework, called VRS-DWMoE, to improve the accuracy of streaming recommendations. In VRS-DWMoE, we first devise variational and reservoir-enhanced sampling to wisely complement new data with historical data, and thus address the user preference drift issue while capturing long-term user preferences. After that, we propose a Double-Wing Mixture of Experts (DWMoE) model to first effectively learn heterogeneous user preferences and item characteristics, and then make recommendations based on them. Specifically, DWMoE contains two Mixture of Experts (MoE, an effective ensemble learning model) to learn user preferences and item characteristics, respectively. Moreover, the multiple experts in each MoE learn the preferences (or characteristics) of different types of users (or items) where each expert specializes in one underlying type. Extensive experiments demonstrate that VRS-DWMoE consistently outperforms the state-of-the-art SRSs.
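The abstract above mentions reservoir-enhanced sampling for keeping historical data alongside new data. As background only, the sketch below shows classical reservoir sampling (Algorithm R), which maintains a uniform random sample of fixed size from a stream of unknown length. It is not the variational and reservoir-enhanced sampler of VRS-DWMoE, whose details the abstract does not give; the function name and the `seed` parameter are assumptions made for illustration.

```python
import random
from typing import Iterable, List, Optional, TypeVar

T = TypeVar("T")


def reservoir_sample(stream: Iterable[T], k: int, seed: Optional[int] = None) -> List[T]:
    """Return a uniform random sample of up to k items from a stream (Algorithm R)."""
    if k < 0:
        raise ValueError("k must be non-negative")
    rng = random.Random(seed)
    reservoir: List[T] = []
    for index, item in enumerate(stream):
        if index < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Item number index+1 is kept with probability k / (index + 1),
            # replacing a uniformly chosen slot.
            j = rng.randint(0, index)  # inclusive on both ends
            if j < k:
                reservoir[j] = item
    return reservoir


# Keep 10 interactions out of a hypothetical stream of 1000 interaction ids.
history_sample = reservoir_sample(range(1000), k=10, seed=0)
```

Memory stays bounded by `k` however long the stream runs, which is the property that makes reservoir-style buffers a natural fit for retaining historical interactions in a streaming setting.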

IRMay 30, 2020
Jointly Modeling Intra- and Inter-transaction Dependencies with Hierarchical Attentive Transaction Embeddings for Next-item Recommendation

Shoujin Wang, Longbing Cao, Liang Hu et al.

A transaction-based recommender system (TBRS) aims to predict the next item by modeling dependencies in transactional data. Generally, two kinds of dependencies considered are intra-transaction dependency and inter-transaction dependency. Most existing TBRSs recommend the next item by only modeling the intra-transaction dependency within the current transaction while ignoring inter-transaction dependency with recent transactions that may also affect the next item. However, as not all recent transactions are relevant to the current and next items, the relevant ones should be identified and prioritized. In this paper, we propose a novel hierarchical attentive transaction embedding (HATE) model to tackle these issues. Specifically, a two-level attention mechanism integrates both item embedding and transaction embedding to build an attentive context representation that incorporates both intra- and inter-transaction dependencies. With the learned context representation, HATE then recommends the next item. Experimental evaluations on two real-world transaction datasets show that HATE significantly outperforms the state-of-the-art methods in terms of recommendation accuracy.

IVMay 8, 2020
Beyond CNNs: Exploiting Further Inherent Symmetries in Medical Images for Segmentation

Shuchao Pang, Anan Du, Mehmet A. Orgun et al.

Automatic tumor segmentation is a crucial step in medical image analysis for computer-aided diagnosis. Although the existing methods based on convolutional neural networks (CNNs) have achieved the state-of-the-art performance, many challenges still remain in medical tumor segmentation. This is because regular CNNs can only exploit translation invariance, ignoring further inherent symmetries existing in medical images such as rotations and reflections. To mitigate this shortcoming, we propose a novel group equivariant segmentation framework by encoding those inherent symmetries for learning more precise representations. First, kernel-based equivariant operations are devised on every orientation, which can effectively address the gaps of learning symmetries in existing approaches. Then, to keep segmentation networks globally equivariant, we design distinctive group layers with layerwise symmetry constraints. By exploiting further symmetries, novel segmentation CNNs can dramatically reduce the sample complexity and the redundancy of filters (by roughly 2/3) over regular CNNs. More importantly, based on our novel framework, we show that a newly built GER-UNet outperforms its regular CNN-based counterpart and the state-of-the-art segmentation methods on real-world clinical data. Specifically, the group layers of our segmentation framework can be seamlessly integrated into any popular CNN-based segmentation architectures.

IRApr 22, 2020
Graph Learning Approaches to Recommender Systems: A Review

Shoujin Wang, Liang Hu, Yan Wang et al.

Recent years have witnessed the fast development of the emerging topic of Graph Learning based Recommender Systems (GLRS). GLRS mainly employ the advanced graph learning approaches to model users' preferences and intentions as well as items' characteristics and popularity for Recommender Systems (RS). Differently from conventional RS, including content based filtering and collaborative filtering, GLRS are built on simple or complex graphs where various objects, e.g., users, items, and attributes, are explicitly or implicitly connected. With the rapid development of graph learning, exploring and exploiting homogeneous or heterogeneous relations in graphs is a promising direction for building advanced RS. In this paper, we provide a systematic review of GLRS, on how they obtain the knowledge from graphs to improve the accuracy, reliability and explainability for recommendations. First, we characterize and formalize GLRS, and then summarize and categorize the key challenges in this new research area. Then, we survey the most recent and important developments in the area. Finally, we share some new research directions in this vibrant area.

IRDec 28, 2019
Sequential Recommender Systems: Challenges, Progress and Prospects

Shoujin Wang, Liang Hu, Yan Wang et al.

The emerging topic of sequential recommender systems (SRSs) has attracted increasing attention in recent years. Different from conventional recommender systems, including collaborative filtering and content-based filtering, SRSs try to understand and model sequential user behaviors, the interactions between users and items, and the evolution of users' preferences and item popularity over time. SRSs involve the above aspects for more precise characterization of user contexts, intent and goals, and item consumption trends, leading to more accurate, customized and dynamic recommendations. In this paper, we provide a systematic review of SRSs. We first present the characteristics of SRSs, and then summarize and categorize the key challenges in this research area, followed by the corresponding research progress consisting of the most recent and representative developments on this topic. Finally, we discuss the important research directions in this vibrant area.

IRFeb 13, 2019
A Survey on Session-based Recommender Systems

Shoujin Wang, Longbing Cao, Yan Wang et al.

Recommender systems (RSs) have been playing an increasingly important role for informed consumption, services, and decision-making in the overloaded information era and digitized economy. In recent years, session-based recommender systems (SBRSs) have emerged as a new paradigm of RSs. Different from other RSs such as content-based RSs and collaborative filtering-based RSs which usually model long-term yet static user preferences, SBRSs aim to capture short-term but dynamic user preferences to provide more timely and accurate recommendations sensitive to the evolution of their session contexts. Although SBRSs have been intensively studied, neither unified problem statements for SBRSs nor in-depth elaboration of SBRS characteristics and challenges are available. It is also unclear to what extent SBRS challenges have been addressed and what the overall research landscape of SBRSs is. This comprehensive review of SBRSs addresses the above aspects by exploring in depth the SBRS entities (e.g., sessions), behaviours (e.g., users' clicks on items) and their properties (e.g., session length). We propose a general problem statement of SBRSs, summarize the diversified data characteristics and challenges of SBRSs, and define a taxonomy to categorize the representative SBRS research. Finally, we discuss new research opportunities in this exciting and vibrant area.