CLJun 2, 2023
Supervised Adversarial Contrastive Learning for Emotion Recognition in ConversationsDou Hu, Yinan Bao, Lingwei Wei et al.
Extracting generalized and robust representations is a major challenge in emotion recognition in conversations (ERC). To address this, we propose a supervised adversarial contrastive learning (SACL) framework for learning class-spread structured representations in a supervised manner. SACL applies contrast-aware adversarial training to generate worst-case samples and uses joint class-spread contrastive learning to extract structured representations. It can effectively utilize label-level feature consistency and retain fine-grained intra-class features. To avoid the negative impact of adversarial perturbations on context-dependent data, we design a contextual adversarial training (CAT) strategy to learn more diverse features from context and enhance the model's context robustness. Under the framework with CAT, we develop a sequence-based SACL-LSTM to learn label-consistent and context-robust features for ERC. Experiments on three datasets show that SACL-LSTM achieves state-of-the-art performance on ERC. Extended experiments prove the effectiveness of SACL and CAT.
CLJun 7, 2022
Speaker-Guided Encoder-Decoder Framework for Emotion Recognition in ConversationYinan Bao, Qianwen Ma, Lingwei Wei et al.
The emotion recognition in conversation (ERC) task aims to predict the emotion label of an utterance in a conversation. Since the dependencies between speakers are complex and dynamic, which consist of intra- and inter-speaker dependencies, the modeling of speaker-specific information is a vital role in ERC. Although existing researchers have proposed various methods of speaker interaction modeling, they cannot explore dynamic intra- and inter-speaker dependencies jointly, leading to the insufficient comprehension of context and further hindering emotion prediction. To this end, we design a novel speaker modeling scheme that explores intra- and inter-speaker dependencies jointly in a dynamic manner. Besides, we propose a Speaker-Guided Encoder-Decoder (SGED) framework for ERC, which fully exploits speaker information for the decoding of emotion. We use different existing methods as the conversational context encoder of our framework, showing the high scalability and flexibility of the proposed framework. Experimental results demonstrate the superiority and effectiveness of SGED.
CLMay 4, 2022
Multi-Granularity Semantic Aware Graph Model for Reducing Position Bias in Emotion-Cause Pair ExtractionYinan Bao, Qianwen Ma, Lingwei Wei et al.
The Emotion-Cause Pair Extraction (ECPE) task aims to extract emotions and causes as pairs from documents. We observe that the relative distance distribution of emotions and causes is extremely imbalanced in the typical ECPE dataset. Existing methods have set a fixed size window to capture relations between neighboring clauses. However, they neglect the effective semantic connections between distant clauses, leading to poor generalization ability towards position-insensitive data. To alleviate the problem, we propose a novel Multi-Granularity Semantic Aware Graph model (MGSAG) to incorporate fine-grained and coarse-grained semantic features jointly, without regard to distance limitation. In particular, we first explore semantic dependencies between clauses and keywords extracted from the document that convey fine-grained semantic features, obtaining keywords enhanced clause representations. Besides, a clause graph is also established to model coarse-grained semantic relations between clauses. Experimental results indicate that MGSAG surpasses the existing state-of-the-art ECPE models. Especially, MGSAG outperforms other models significantly in the condition of position-insensitive data.
LGMay 21Code
ARC-STAR: Auditable Post-Hoc Correction for PDE Foundation ModelsChengze Li, Lingwei Wei, Li Sun et al.
Partial differential equation (PDE) foundation models are pretrained networks that forecast how physical fields like velocity and pressure evolve from a single reusable solver. On unfamiliar flows their predictions drift step by step, errors concentrate in a few regions, yet retraining destabilizes the network and uniform post-hoc correction overlooks this spatial concentration. To address this, we propose a frozen-solver post-hoc correction framework, Adaptive Risk-Calibrated Spatial Triage for Auditable Refinement (ARC-STAR). ARC-STAR organizes correction into three stages: a global corrector removes broad solver bias, a blockwise local refiner cleans the post-global residual, and, at deployment, a label-free score routes refinement to high-risk blocks under a compute budget. The framework is designed to be (i) frozen-host, preserving the pretrained solver without fine-tuning; (ii) auditable, with global and local stages trained and evaluated separately for measurable contributions; and (iii) budget-aware, using a blockwise interface that either refines the full field or routes limited compute to high-risk regions. Across five flow benchmarks spanning ten regime cells, ARC-STAR is the only method that cuts velocity rollout error by at least 36x over raw Poseidon on every cell. The global stage reduces raw host error by 91-99%, and the local stage further reduces the remaining post-global residual by up to 94.4%. Our code implementation is available at https://anonymous.4open.science/r/arc_star.
CLMar 4, 2022
MM-DFN: Multimodal Dynamic Fusion Network for Emotion Recognition in ConversationsDou Hu, Xiaolong Hou, Lingwei Wei et al.
Emotion Recognition in Conversations (ERC) has considerable prospects for developing empathetic machines. For multimodal ERC, it is vital to understand context and fuse modality information in conversations. Recent graph-based fusion methods generally aggregate multimodal information by exploring unimodal and cross-modal interactions in a graph. However, they accumulate redundant information at each layer, limiting the context understanding between modalities. In this paper, we propose a novel Multimodal Dynamic Fusion Network (MM-DFN) to recognize emotions by fully understanding multimodal conversational context. Specifically, we design a new graph-based dynamic fusion module to fuse multimodal contextual features in a conversation. The module reduces redundancy and enhances complementarity between modalities by capturing the dynamics of contextual information in different semantic spaces. Extensive experiments on two public benchmark datasets demonstrate the effectiveness and superiority of MM-DFN.
CLJun 1, 2023
UCAS-IIE-NLP at SemEval-2023 Task 12: Enhancing Generalization of Multilingual BERT for Low-resource Sentiment AnalysisDou Hu, Lingwei Wei, Yaxin Liu et al.
This paper describes our system designed for SemEval-2023 Task 12: Sentiment analysis for African languages. The challenge faced by this task is the scarcity of labeled data and linguistic resources in low-resource settings. To alleviate these, we propose a generalized multilingual system SACL-XLMR for sentiment analysis on low-resource languages. Specifically, we design a lexicon-based multilingual BERT to facilitate language adaptation and sentiment-aware representation learning. Besides, we apply a supervised adversarial contrastive learning technique to learn sentiment-spread structured representations and enhance model generalization. Our system achieved competitive results, largely outperforming baselines on both multilingual and zero-shot sentiment classification subtasks. Notably, the system obtained the 1st rank on the zero-shot classification subtask in the official ranking. Extensive experiments demonstrate the effectiveness of our system.
CLNov 21, 2023
Explore the Potential of LLMs in Misinformation Detection: An Empirical StudyMengyang Chen, Lingwei Wei, Han Cao et al.
Large Language Models (LLMs) have garnered significant attention for their powerful ability in natural language understanding and reasoning. In this paper, we present a comprehensive empirical study to explore the performance of LLMs on misinformation detection tasks. This study stands as the pioneering investigation into the understanding capabilities of multiple LLMs regarding both content and propagation across social media platforms. Our empirical studies on eight misinformation detection datasets show that LLM-based detectors can achieve comparable performance in text-based misinformation detection but exhibit notably constrained capabilities in comprehending propagation structure compared to existing models in propagation-based misinformation detection. Our experiments further demonstrate that LLMs exhibit great potential to enhance existing misinformation detection models. These findings highlight the potential ability of LLMs to detect misinformation.
CLNov 29, 2023
Are Large Language Models Good Fact Checkers: A Preliminary StudyHan Cao, Lingwei Wei, Mengyang Chen et al.
Recently, Large Language Models (LLMs) have drawn significant attention due to their outstanding reasoning capabilities and extensive knowledge repository, positioning them as superior in handling various natural language processing tasks compared to other language models. In this paper, we present a preliminary investigation into the potential of LLMs in fact-checking. This study aims to comprehensively evaluate various LLMs in tackling specific fact-checking subtasks, systematically evaluating their capabilities, and conducting a comparative analysis of their performance against pre-trained and state-of-the-art low-parameter models. Experiments demonstrate that LLMs achieve competitive performance compared to other small models in most scenarios. However, they encounter challenges in effectively handling Chinese fact verification and the entirety of the fact-checking pipeline due to language inconsistencies and hallucinations. These findings underscore the need for further exploration and research to enhance the proficiency of LLMs as reliable fact-checkers, unveiling the potential capability of LLMs and the possible challenges in fact-checking tasks.
SIJul 13, 2024
Transferring Structure Knowledge: A New Task to Fake news Detection Towards Cold-Start PropagationLingwei Wei, Dou Hu, Wei Zhou et al.
Many fake news detection studies have achieved promising performance by extracting effective semantic and structure features from both content and propagation trees. However, it is challenging to apply them to practical situations, especially when using the trained propagation-based models to detect news with no propagation data. Towards this scenario, we study a new task named cold-start fake news detection, which aims to detect content-only samples with missing propagation. To achieve the task, we design a simple but effective Structure Adversarial Net (SAN) framework to learn transferable features from available propagation to boost the detection of content-only samples. SAN introduces a structure discriminator to estimate dissimilarities among learned features with and without propagation, and further learns structure-invariant features to enhance the generalization of existing propagation-based methods for content-only samples. We conduct qualitative and quantitative experiments on three datasets. Results show the challenge of the new task and the effectiveness of our SAN framework.
CLMay 4
An Information-theoretic Propagation Denoising and Fusion Framework for Fake News DetectionMengyang Chen, Lingwei Wei, Wei Zhou et al.
Incomplete propagation data significantly hinders robust fake news detection. Recent approaches leverage large language models to simulate missing user interactions via role-playing, thereby enriching propagation with synthetic signals. However, such propagation data is intrinsically unreliable, and directly fusing it can lead to biased representations, leading to limited detection performance. In this paper, we alleviate the unreliability of synthetic propagation from the mutual information perspective and propose a novel information-theoretic propagation denoising and fusion (InfoPDF) framework to learn effective representations from both real and synthetic propagation. Specifically, we first generate attribute-specific synthetic propagation using large language models. Then we model each synthetic propagation graph as a probabilistic latent distribution to guide reliability-aware adaptive fusion with real propagation. During training, we design a mutual information-based objective to learn compressed and task-sufficient propagation representations. It jointly suppresses noisy signals across attribute-specific synthetic propagation, maintains consistency between real and synthetic propagation representations, and ensures task sufficiency for fake news detection and attribute prediction. Experiments on three real-world datasets show that InfoPDF consistently achieves superior performance across various fake news detection tasks. Further analysis demonstrates that InfoPDF can estimate attribute-level reliabilities and learn more discriminative propagation representations.
CLApr 27
Propagation Structure-Semantic Transfer Learning for Robust Fake News DetectionMengyang Chen, Lingwei Wei, Han Cao et al.
Fake news generally refers to false information that is spread deliberately to deceive people, which has detrimental social effects. Existing fake news detection methods primarily learn the semantic features from news content or integrate structural features from propagation. However, in practical scenarios, due to the semantic ambiguity of informal language and unreliable user interactive behaviors on social media, there are inherent semantic and structural noises in news content and propagation. Although some recent works consider the effect of irrelevant user interactions in a hybrid-modeling way, they still suffer from the mutual interference between structural noise and semantic noise, leading to limited performance for robust detection. To alleviate this issue, this paper proposes a novel Propagation Structure-Semantic Transfer Learning framework (PSS-TL) for robust fake news detection under a teacher-student architecture. Specifically, we design dual teacher models to learn semantics knowledge and structure knowledge from noisy news content and propagation structure independently. Besides, we design a Multi-channel Knowledge Distillation (MKD) loss to enable the student model to acquire specialized knowledge from the teacher models, thereby avoiding mutual interference. Extensive experiments on two real-world datasets validate the effectiveness and robustness of our method.
CLDec 21, 2023
Structured Probabilistic CodingDou Hu, Lingwei Wei, Yaxin Liu et al.
This paper presents a new supervised representation learning framework, namely structured probabilistic coding (SPC), to learn compact and informative representations from input related to the target task. SPC is an encoder-only probabilistic coding technology with a structured regularization from the target space. It can enhance the generalization ability of pre-trained language models for better language understanding. Specifically, our probabilistic coding simultaneously performs information encoding and task prediction in one module to more fully utilize the effective information from input data. It uses variational inference in the output space to reduce randomness and uncertainty. Besides, to better control the learning process of probabilistic representations, a structured regularization is proposed to promote uniformity across classes in the latent space. With the regularization term, SPC can preserve the Gaussian structure of the latent code and achieve better coverage of the hidden space with class uniformly. Experimental results on 12 natural language understanding tasks demonstrate that our SPC effectively improves the performance of pre-trained language models for classification and regression. Extensive experiments show that SPC can enhance the generalization capability, robustness to label noise, and clustering quality of output representations.
CLMar 6, 2025
An Information-theoretic Multi-task Representation Learning Framework for Natural Language UnderstandingDou Hu, Lingwei Wei, Wei Zhou et al.
This paper proposes a new principled multi-task representation learning framework (InfoMTL) to extract noise-invariant sufficient representations for all tasks. It ensures sufficiency of shared representations for all tasks and mitigates the negative effect of redundant features, which can enhance language understanding of pre-trained language models (PLMs) under the multi-task paradigm. Firstly, a shared information maximization principle is proposed to learn more sufficient shared representations for all target tasks. It can avoid the insufficiency issue arising from representation compression in the multi-task paradigm. Secondly, a task-specific information minimization principle is designed to mitigate the negative effect of potential redundant features in the input for each task. It can compress task-irrelevant redundant information and preserve necessary information relevant to the target for multi-task prediction. Experiments on six classification benchmarks show that our method outperforms 12 comparative multi-task methods under the same multi-task settings, especially in data-constrained and noisy scenarios. Extensive experiments demonstrate that the learned representations are more sufficient, data-efficient, and robust.
CLMar 11, 2025
Enhancing Multi-Hop Fact Verification with Structured Knowledge-Augmented Large Language ModelsHan Cao, Lingwei Wei, Wei Zhou et al.
The rapid development of social platforms exacerbates the dissemination of misinformation, which stimulates the research in fact verification. Recent studies tend to leverage semantic features to solve this problem as a single-hop task. However, the process of verifying a claim requires several pieces of evidence with complicated inner logic and relations to verify the given claim in real-world situations. Recent studies attempt to improve both understanding and reasoning abilities to enhance the performance, but they overlook the crucial relations between entities that benefit models to understand better and facilitate the prediction. To emphasize the significance of relations, we resort to Large Language Models (LLMs) considering their excellent understanding ability. Instead of other methods using LLMs as the predictor, we take them as relation extractors, for they do better in understanding rather than reasoning according to the experimental results. Thus, to solve the challenges above, we propose a novel Structured Knowledge-Augmented LLM-based Network (LLM-SKAN) for multi-hop fact verification. Specifically, we utilize an LLM-driven Knowledge Extractor to capture fine-grained information, including entities and their complicated relations. Besides, we leverage a Knowledge-Augmented Relation Graph Fusion module to interact with each node and learn better claim-evidence representations comprehensively. The experimental results on four common-used datasets demonstrate the effectiveness and superiority of our model.
LGJun 8, 2024
Representation Learning with Conditional Information Flow MaximizationDou Hu, Lingwei Wei, Wei Zhou et al.
This paper proposes an information-theoretic representation learning framework, named conditional information flow maximization, to extract noise-invariant sufficient representations for the input data and target task. It promotes the learned representations have good feature uniformity and sufficient predictive ability, which can enhance the generalization of pre-trained language models (PLMs) for the target task. Firstly, an information flow maximization principle is proposed to learn more sufficient representations for the input and target by simultaneously maximizing both input-representation and representation-label mutual information. Unlike the information bottleneck, we handle the input-representation information in an opposite way to avoid the over-compression issue of latent representations. Besides, to mitigate the negative effect of potential redundant features from the input, we design a conditional information minimization principle to eliminate negative redundant features while preserve noise-invariant features. Experiments on 13 language understanding benchmarks demonstrate that our method effectively improves the performance of PLMs for classification and regression. Extensive experiments show that the learned representations are more sufficient, robust and transferable.
AIJul 26, 2021
Towards Propagation Uncertainty: Edge-enhanced Bayesian Graph Convolutional Networks for Rumor DetectionLingwei Wei, Dou Hu, Wei Zhou et al.
Detecting rumors on social media is a very critical task with significant implications to the economy, public health, etc. Previous works generally capture effective features from texts and the propagation structure. However, the uncertainty caused by unreliable relations in the propagation structure is common and inevitable due to wily rumor producers and the limited collection of spread data. Most approaches neglect it and may seriously limit the learning of features. Towards this issue, this paper makes the first attempt to explore propagation uncertainty for rumor detection. Specifically, we propose a novel Edge-enhanced Bayesian Graph Convolutional Network (EBGCN) to capture robust structural features. The model adaptively rethinks the reliability of latent relations by adopting a Bayesian approach. Besides, we design a new edge-wise consistency training framework to optimize the model by enforcing consistency on relations. Experiments on three public benchmark datasets demonstrate that the proposed model achieves better performance than baseline methods on both rumor detection and early rumor detection tasks.
IRJun 17, 2021
PEN4Rec: Preference Evolution Networks for Session-based RecommendationDou Hu, Lingwei Wei, Wei Zhou et al.
Session-based recommendation aims to predict user the next action based on historical behaviors in an anonymous session. For better recommendations, it is vital to capture user preferences as well as their dynamics. Besides, user preferences evolve over time dynamically and each preference has its own evolving track. However, most previous works neglect the evolving trend of preferences and can be easily disturbed by the effect of preference drifting. In this paper, we propose a novel Preference Evolution Networks for session-based Recommendation (PEN4Rec) to model preference evolving process by a two-stage retrieval from historical contexts. Specifically, the first-stage process integrates relevant behaviors according to recent items. Then, the second-stage process models the preference evolving trajectory over time dynamically and infer rich preferences. The process can strengthen the effect of relevant sequential behaviors during the preference evolution and weaken the disturbance from preference drifting. Extensive experiments on three public datasets demonstrate the effectiveness and superiority of the proposed model.
CLJun 3, 2021
DialogueCRN: Contextual Reasoning Networks for Emotion Recognition in ConversationsDou Hu, Lingwei Wei, Xiaoyong Huai
Emotion Recognition in Conversations (ERC) has gained increasing attention for developing empathetic machines. Recently, many approaches have been devoted to perceiving conversational context by deep learning models. However, these approaches are insufficient in understanding the context due to lacking the ability to extract and integrate emotional clues. In this work, we propose novel Contextual Reasoning Networks (DialogueCRN) to fully understand the conversational context from a cognitive perspective. Inspired by the Cognitive Theory of Emotion, we design multi-turn reasoning modules to extract and integrate emotional clues. The reasoning module iteratively performs an intuitive retrieving process and a conscious reasoning process, which imitates human unique cognitive thinking. Extensive experiments on three public benchmark datasets demonstrate the effectiveness and superiority of the proposed model.
CLJul 16, 2020
Hierarchical Interaction Networks with Rethinking Mechanism for Document-level Sentiment AnalysisLingwei Wei, Dou Hu, Wei Zhou et al.
Document-level Sentiment Analysis (DSA) is more challenging due to vague semantic links and complicate sentiment information. Recent works have been devoted to leveraging text summarization and have achieved promising results. However, these summarization-based methods did not take full advantage of the summary including ignoring the inherent interactions between the summary and document. As a result, they limited the representation to express major points in the document, which is highly indicative of the key sentiment. In this paper, we study how to effectively generate a discriminative representation with explicit subject patterns and sentiment contexts for DSA. A Hierarchical Interaction Networks (HIN) is proposed to explore bidirectional interactions between the summary and document at multiple granularities and learn subject-oriented document representations for sentiment classification. Furthermore, we design a Sentiment-based Rethinking mechanism (SR) by refining the HIN with sentiment label information to learn a more sentiment-aware document representation. We extensively evaluate our proposed models on three public datasets. The experimental results consistently demonstrate the effectiveness of our proposed models and show that HIN-SR outperforms various state-of-the-art methods.
CLJul 16, 2020
SLK-NER: Exploiting Second-order Lexicon Knowledge for Chinese NERDou Hu, Lingwei Wei
Although character-based models using lexicon have achieved promising results for Chinese named entity recognition (NER) task, some lexical words would introduce erroneous information due to wrongly matched words. Existing researches proposed many strategies to integrate lexicon knowledge. However, they performed with simple first-order lexicon knowledge, which provided insufficient word information and still faced the challenge of matched word boundary conflicts; or explored the lexicon knowledge with graph where higher-order information introducing negative words may disturb the identification. To alleviate the above limitations, we present new insight into second-order lexicon knowledge (SLK) of each character in the sentence to provide more lexical word information including semantic and word boundary features. Based on these, we propose a SLK-based model with a novel strategy to integrate the above lexicon knowledge. The proposed model can exploit more discernible lexical words information with the help of global context. Experimental results on three public datasets demonstrate the validity of SLK. The proposed model achieves more excellent performance than the state-of-the-art comparison methods.