AIJul 10, 2023Code
Exploring Large Language Model for Graph Data Understanding in Online Job RecommendationsLikang Wu, Zhaopeng Qiu, Zhi Zheng et al.
Large Language Models (LLMs) have revolutionized natural language processing tasks, demonstrating their exceptional capabilities in various domains. However, their potential for behavior graph understanding in job recommendations remains largely unexplored. This paper focuses on unveiling the capability of large language models in understanding behavior graphs and leveraging this understanding to enhance recommendations in online recruitment, including the promotion of out-of-distribution (OOD) application. We present a novel framework that harnesses the rich contextual information and semantic representations provided by large language models to analyze behavior graphs and uncover underlying patterns and relationships. Specifically, we propose a meta-path prompt constructor that leverages LLM recommender to understand behavior graphs for the first time and design a corresponding path augmentation module to alleviate the prompt bias introduced by path-based sequence input. By leveraging this capability, our framework enables personalized and accurate job recommendations for individual users. We evaluate the effectiveness of our approach on a comprehensive dataset and demonstrate its ability to improve the relevance and quality of recommended quality. This research not only sheds light on the untapped potential of large language models but also provides valuable insights for developing advanced recommendation systems in the recruitment market. The findings contribute to the growing field of natural language processing and offer practical implications for enhancing job search experiences. We release the code at https://github.com/WLiK/GLRec.
IVNov 12, 2022Code
DeltaNet:Conditional Medical Report Generation for COVID-19 DiagnosisXian Wu, Shuxin Yang, Zhaopeng Qiu et al.
Fast screening and diagnosis are critical in COVID-19 patient treatment. In addition to the gold standard RT-PCR, radiological imaging like X-ray and CT also works as an important means in patient screening and follow-up. However, due to the excessive number of patients, writing reports becomes a heavy burden for radiologists. To reduce the workload of radiologists, we propose DeltaNet to generate medical reports automatically. Different from typical image captioning approaches that generate reports with an encoder and a decoder, DeltaNet applies a conditional generation process. In particular, given a medical image, DeltaNet employs three steps to generate a report: 1) first retrieving related medical reports, i.e., the historical reports from the same or similar patients; 2) then comparing retrieved images and current image to find the differences; 3) finally generating a new report to accommodate identified differences based on the conditional report. We evaluate DeltaNet on a COVID-19 dataset, where DeltaNet outperforms state-of-the-art approaches. Besides COVID-19, the proposed DeltaNet can be applied to other diseases as well. We validate its generalization capabilities on the public IU-Xray and MIMIC-CXR datasets for chest-related diseases. Code is available at \url{https://github.com/LX-doctorAI1/DeltaNet}.
IRJul 5, 2023
Generative Job Recommendations with Large Language ModelZhi Zheng, Zhaopeng Qiu, Xiao Hu et al.
The rapid development of online recruitment services has encouraged the utilization of recommender systems to streamline the job seeking process. Predominantly, current job recommendations deploy either collaborative filtering or person-job matching strategies. However, these models tend to operate as "black-box" systems and lack the capacity to offer explainable guidance to job seekers. Moreover, conventional matching-based recommendation methods are limited to retrieving and ranking existing jobs in the database, restricting their potential as comprehensive career AI advisors. To this end, here we present GIRL (GeneratIve job Recommendation based on Large language models), a novel approach inspired by recent advancements in the field of Large Language Models (LLMs). We initially employ a Supervised Fine-Tuning (SFT) strategy to instruct the LLM-based generator in crafting suitable Job Descriptions (JDs) based on the Curriculum Vitae (CV) of a job seeker. Moreover, we propose to train a model which can evaluate the matching degree between CVs and JDs as a reward model, and we use Proximal Policy Optimization (PPO)-based Reinforcement Learning (RL) method to further fine-tine the generator. This aligns the generator with recruiter feedback, tailoring the output to better meet employer preferences. In particular, GIRL serves as a job seeker-centric generative model, providing job suggestions without the need of a candidate set. This capability also enhances the performance of existing job recommendation models by supplementing job seeking features with generated content. With extensive experiments on a large-scale real-world dataset, we demonstrate the substantial effectiveness of our approach. We believe that GIRL introduces a paradigm-shifting approach to job recommendation systems, fostering a more personalized and comprehensive job-seeking experience.
IRApr 9, 2022
Denoising Neural Network for News Recommendation with Positive and Negative Implicit FeedbackYunfan Hu, Zhaopeng Qiu, Xian Wu
News recommendation is different from movie or e-commercial recommendation as people usually do not grade the news. Therefore, user feedback for news is always implicit (click behavior, reading time, etc). Inevitably, there are noises in implicit feedback. On one hand, the user may exit immediately after clicking the news as he dislikes the news content, leaving the noise in his positive implicit feedback; on the other hand, the user may be recommended multiple interesting news at the same time and only click one of them, producing the noise in his negative implicit feedback. Opposite implicit feedback could construct more integrated user preferences and help each other to minimize the noise influence. Previous works on news recommendation only used positive implicit feedback and suffered from the noise impact. In this paper, we propose a denoising neural network for news recommendation with positive and negative implicit feedback, named DRPN. DRPN utilizes both feedback for recommendation with a module to denoise both positive and negative implicit feedback to further enhance the performance. Experiments on the real-world large-scale dataset demonstrate the state-of-the-art performance of DRPN.
IRMay 31, 2023Code
A Survey on Large Language Models for RecommendationLikang Wu, Zhi Zheng, Zhaopeng Qiu et al.
Large Language Models (LLMs) have emerged as powerful tools in the field of Natural Language Processing (NLP) and have recently gained significant attention in the domain of Recommendation Systems (RS). These models, trained on massive amounts of data using self-supervised learning, have demonstrated remarkable success in learning universal representations and have the potential to enhance various aspects of recommendation systems by some effective transfer techniques such as fine-tuning and prompt tuning, and so on. The crucial aspect of harnessing the power of language models in enhancing recommendation quality is the utilization of their high-quality representations of textual features and their extensive coverage of external knowledge to establish correlations between items and users. To provide a comprehensive understanding of the existing LLM-based recommendation systems, this survey presents a taxonomy that categorizes these models into two major paradigms, respectively Discriminative LLM for Recommendation (DLLM4Rec) and Generative LLM for Recommendation (GLLM4Rec), with the latter being systematically sorted out for the first time. Furthermore, we systematically review and analyze existing LLM-based recommendation systems within each paradigm, providing insights into their methodologies, techniques, and performance. Additionally, we identify key challenges and several valuable findings to provide researchers and practitioners with inspiration. We have also created a GitHub repository to index relevant papers on LLMs for recommendation, https://github.com/WLiK/LLM4Rec.
LGJan 31, 2024
A Cross-View Hierarchical Graph Learning Hypernetwork for Skill Demand-Supply Joint PredictionWenshuo Chao, Zhaopeng Qiu, Likang Wu et al.
The rapidly changing landscape of technology and industries leads to dynamic skill requirements, making it crucial for employees and employers to anticipate such shifts to maintain a competitive edge in the labor market. Existing efforts in this area either rely on domain-expert knowledge or regarding skill evolution as a simplified time series forecasting problem. However, both approaches overlook the sophisticated relationships among different skills and the inner-connection between skill demand and supply variations. In this paper, we propose a Cross-view Hierarchical Graph learning Hypernetwork (CHGH) framework for joint skill demand-supply prediction. Specifically, CHGH is an encoder-decoder network consisting of i) a cross-view graph encoder to capture the interconnection between skill demand and supply, ii) a hierarchical graph encoder to model the co-evolution of skills from a cluster-wise perspective, and iii) a conditional hyper-decoder to jointly predict demand and supply variations by incorporating historical demand-supply gaps. Extensive experiments on three real-world datasets demonstrate the superiority of the proposed framework compared to seven baselines and the effectiveness of the three modules.
LGJan 26
FP8-RL: A Practical and Stable Low-Precision Stack for LLM Reinforcement LearningZhaopeng Qiu, Shuang Yu, Jingqi Zhang et al.
Reinforcement learning (RL) for large language models (LLMs) is increasingly bottlenecked by rollout (generation), where long output sequence lengths make attention and KV-cache memory dominate end-to-end step time. FP8 offers an attractive lever for accelerating RL by reducing compute cost and memory traffic during rollout, but applying FP8 in RL introduces unique engineering and algorithmic challenges: policy weights change every step (requiring repeated quantization and weight synchronization into the inference engine) and low-precision rollouts can deviate from the higher-precision policy assumed by the trainer, causing train-inference mismatch and potential instability. This report presents a practical FP8 rollout stack for LLM RL, implemented in the veRL ecosystem with support for common training backends (e.g., FSDP/Megatron-LM) and inference engines (e.g., vLLM/SGLang). We (i) enable FP8 W8A8 linear-layer rollout using blockwise FP8 quantization, (ii) extend FP8 to KV-cache to remove long-context memory bottlenecks via per-step QKV scale recalibration, and (iii) mitigate mismatch using importance-sampling-based rollout correction (token-level TIS/MIS variants). Across dense and MoE models, these techniques deliver up to 44% rollout throughput gains while preserving learning behavior comparable to BF16 baselines.
CLMay 17, 2023
Large Language Models Leverage External Knowledge to Extend Clinical Insight Beyond Language BoundariesJiageng Wu, Xian Wu, Zhaopeng Qiu et al.
$\textbf{Objectives}$: Large Language Models (LLMs) such as ChatGPT and Med-PaLM have excelled in various medical question-answering tasks. However, these English-centric models encounter challenges in non-English clinical settings, primarily due to limited clinical knowledge in respective languages, a consequence of imbalanced training corpora. We systematically evaluate LLMs in the Chinese medical context and develop a novel in-context learning framework to enhance their performance. $\textbf{Materials and Methods}$: The latest China National Medical Licensing Examination (CNMLE-2022) served as the benchmark. We collected 53 medical books and 381,149 medical questions to construct the medical knowledge base and question bank. The proposed Knowledge and Few-shot Enhancement In-context Learning (KFE) framework leverages the in-context learning ability of LLMs to integrate diverse external clinical knowledge sources. We evaluated KFE with ChatGPT(GPT3.5), GPT4, Baichuan2(BC2)-7B, and BC2-13B in CNMLE-2022 and investigated the effectiveness of different pathways for incorporating LLMs with medical knowledge from 7 perspectives. $\textbf{Results}$: Directly applying ChatGPT failed to qualify for the CNMLE-2022 at a score of 51. Cooperated with the KFE, the LLMs with varying sizes yielded consistent and significant improvements. The ChatGPT's performance surged to 70.04 and GPT-4 achieved the highest score of 82.59. This surpasses the qualification threshold (60) and exceeds the average human score of 68.70. It also enabled a smaller BC2-13B to pass the examination, showcasing the great potential in low-resource settings. $\textbf{Conclusion}$: By synergizing medical knowledge through in-context learning, LLM can extend clinical insight beyond language barriers, significantly reducing language-related disparities of LLM applications and ensuring global benefit in healthcare.
LGFeb 14, 2022
Conditional Generation Net for Medication RecommendationRui Wu, Zhaopeng Qiu, Jiacheng Jiang et al.
Medication recommendation targets to provide a proper set of medicines according to patients' diagnoses, which is a critical task in clinics. Currently, the recommendation is manually conducted by doctors. However, for complicated cases, like patients with multiple diseases at the same time, it's difficult to propose a considerate recommendation even for experienced doctors. This urges the emergence of automatic medication recommendation which can help treat the diagnosed diseases without causing harmful drug-drug interactions.Due to the clinical value, medication recommendation has attracted growing research interests.Existing works mainly formulate medication recommendation as a multi-label classification task to predict the set of medicines. In this paper, we propose the Conditional Generation Net (COGNet) which introduces a novel copy-or-predict mechanism to generate the set of medicines. Given a patient, the proposed model first retrieves his or her historical diagnoses and medication recommendations and mines their relationship with current diagnoses. Then in predicting each medicine, the proposed model decides whether to copy a medicine from previous recommendations or to predict a new one. This process is quite similar to the decision process of human doctors. We validate the proposed model on the public MIMIC data set, and the experimental results show that the proposed model can outperform state-of-the-art approaches.
CLNov 26, 2020
Automatic Distractor Generation for Multiple Choice Questions in Standard TestsZhaopeng Qiu, Xian Wu, Wei Fan
To assess the knowledge proficiency of a learner, multiple choice question is an efficient and widespread form in standard tests. However, the composition of the multiple choice question, especially the construction of distractors is quite challenging. The distractors are required to both incorrect and plausible enough to confuse the learners who did not master the knowledge. Currently, the distractors are generated by domain experts which are both expensive and time-consuming. This urges the emergence of automatic distractor generation, which can benefit various standard tests in a wide range of domains. In this paper, we propose a question and answer guided distractor generation (EDGE) framework to automate distractor generation. EDGE consists of three major modules: (1) the Reforming Question Module and the Reforming Passage Module apply gate layers to guarantee the inherent incorrectness of the generated distractors; (2) the Distractor Generator Module applies attention mechanism to control the level of plausibility. Experimental results on a large-scale public dataset demonstrate that our model significantly outperforms existing models and achieves a new state-of-the-art.
SIMay 22, 2018
Social-Network-Assisted Worker Recruitment in Mobile Crowd SensingJiangtao Wang, Feng Wang, Yasha Wang et al.
Worker recruitment is a crucial research problem in Mobile Crowd Sensing (MCS). While previous studies rely on a specified platform with a pre-assumed large user pool, this paper leverages the influenced propagation on the social network to assist the MCS worker recruitment. We first select a subset of users on the social network as initial seeds and push MCS tasks to them. Then, influenced users who accept tasks are recruited as workers, and the ultimate goal is to maximize the coverage. Specifically, to select a near-optimal set of seeds, we propose two algorithms, named Basic-Selector and Fast-Selector, respectively. Basic-Selector adopts an iterative greedy process based on the predicted mobility, which has good performance but suffers from inefficiency concerns. To accelerate the selection, Fast-Selector is proposed, which is based on the interdependency of geographical positions among friends. Empirical studies on two real-world datasets verify that Fast-Selector achieves higher coverage than baseline methods under various settings, meanwhile, it is much more efficient than Basic-Selector while only sacrificing a slight fraction of the coverage.
HCMay 22, 2018
HyTasker: Hybrid Task Allocation in Mobile Crowd SensingJiangtao Wang, Feng Wang, Yasha Wang et al.
Task allocation is a major challenge in Mobile Crowd Sensing (MCS). While previous task allocation approaches follow either the opportunistic or participatory mode, this paper proposes to integrate these two complementary modes in a two-phased hybrid framework called HyTasker. In the offline phase, a group of workers (called opportunistic workers) are selected, and they complete MCS tasks during their daily routines (i.e., opportunistic mode). In the online phase, we assign another set of workers (called participatory workers) and require them to move specifically to perform tasks that are not completed by the opportunistic workers (i.e., participatory mode). Instead of considering these two phases separately, HyTasker jointly optimizes them with a total incentive budget constraint. In particular, when selecting opportunistic workers in the offline phase of HyTasker, we propose a novel algorithm that simultaneously considers the predicted task assignment for the participatory workers, in which the density and mobility of participatory workers are taken into account. Experiments on a real-world mobility dataset demonstrate that HyTasker outperforms other methods with more completed tasks under the same budget constraint.