CL May 16, 2023
MsPrompt: Multi-step Prompt Learning for Debiasing Few-shot Event Detection
Siyuan Wang, Jianming Zheng, Xuejun Hu et al.
Event detection (ED) aims to identify the key trigger words in unstructured text and predict the corresponding event types. Traditional ED models are too data-hungry to suit real applications with scarce labeled data. Moreover, typical ED models suffer from context-bypassing and impaired generalization caused by the trigger bias in ED datasets. We therefore focus on the true few-shot paradigm to address low-resource scenarios. In particular, we propose a multi-step prompt learning model (MsPrompt) for debiasing few-shot event detection, which consists of three components: an under-sampling module that constructs a novel training set suited to the true few-shot setting; a multi-step prompt module equipped with a knowledge-enhanced ontology that fully leverages the event semantics and latent prior knowledge in pre-trained language models (PLMs) to tackle the context-bypassing problem; and a prototypical module that compensates for the weakness of classifying events with sparse data and boosts generalization performance. Experiments on two public datasets, ACE-2005 and FewEvent, show that MsPrompt outperforms state-of-the-art models, especially in strict low-resource scenarios, where it achieves an 11.43% improvement in weighted F1-score over the best-performing baseline along with strong debiasing performance.
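The abstract does not spell out how the prototypical module works. As a rough illustration of the general prototypical-classification idea it names, the sketch below averages the embeddings of the few labeled examples per event type into a prototype and assigns a query to the nearest one. The function names, the toy 2-dimensional embeddings, and the squared Euclidean distance are all assumptions made for illustration and are not taken from MsPrompt.

```python
from collections import defaultdict


def build_prototypes(support):
    """Average the support embeddings of each label into one prototype.

    `support` is a list of (embedding, label) pairs (hypothetical format).
    """
    sums = {}
    counts = defaultdict(int)
    for vec, label in support:
        if label not in sums:
            sums[label] = [0.0] * len(vec)
        sums[label] = [s + v for s, v in zip(sums[label], vec)]
        counts[label] += 1
    return {
        label: [s / counts[label] for s in total]
        for label, total in sums.items()
    }


def classify(query, prototypes):
    """Return the label whose prototype is closest to the query embedding."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(prototypes, key=lambda label: sq_dist(query, prototypes[label]))


# Toy few-shot support set: two examples per event type.
support = [
    ([0.0, 0.0], "Attack"),
    ([0.0, 2.0], "Attack"),
    ([4.0, 4.0], "Marry"),
    ([6.0, 4.0], "Marry"),
]
prototypes = build_prototypes(support)
predicted = classify([1.0, 1.0], prototypes)
```

Because each class is summarized by a single mean vector, a classifier of this kind needs no per-class parameters, which is one reason the approach is commonly used when labeled data is sparse.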
IR May 9, 2020
Rethinking Item Importance in Session-based Recommendation
Zhiqiang Pan, Fei Cai, Yanxiang Ling et al.
Session-based recommendation aims to predict users' next actions based on anonymous sessions. Previous work mainly focuses on the transition relationships between items during an ongoing session and generally fails to pay enough attention to the importance of items in terms of their relevance to the user's main intent. In this paper, we propose a Session-based Recommendation approach with an Importance Extraction Module, i.e., SR-IEM, that considers both a user's long-term and recent behavior in an ongoing session. We employ a modified self-attention mechanism to estimate item importance in a session, which is then used to predict the user's long-term preference. Item recommendations are produced by combining the user's long-term preference and current interest as conveyed by the last interacted item. Experiments conducted on two benchmark datasets validate that SR-IEM outperforms the state-of-the-art in terms of Recall and MRR and has a reduced computational complexity.
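The pipeline described above (estimate item importance, form a long-term preference, combine it with the last item) can be sketched as follows. This is a simplified sketch under stated assumptions, not the SR-IEM implementation: it omits learned projections, scores each item by its average scaled dot-product affinity to the other items in the session, normalizes the scores with a softmax, and combines the two signals by plain concatenation. All function names are hypothetical.

```python
import math


def item_importance(items):
    """Softmax-normalized importance of each item in a session.

    Each item is scored by its mean scaled dot product with the *other*
    items, so an item cannot look important merely by matching itself.
    """
    n = len(items)
    d = len(items[0])
    scores = []
    for i in range(n):
        others = [
            sum(a * b for a, b in zip(items[i], items[j])) / math.sqrt(d)
            for j in range(n)
            if j != i
        ]
        scores.append(sum(others) / len(others) if others else 0.0)
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def session_representation(items):
    """Concatenate the importance-weighted long-term preference with the
    embedding of the last interacted item (the current interest)."""
    weights = item_importance(items)
    d = len(items[0])
    long_term = [
        sum(w * item[k] for w, item in zip(weights, items)) for k in range(d)
    ]
    return long_term + list(items[-1])


# Toy session: two similar items followed by an unrelated last click.
session = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
weights = item_importance(session)
representation = session_representation(session)
```

In this toy session the two mutually similar items receive equal, higher weights than the unrelated last click, while the last click still contributes through the concatenated current-interest part.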
IR Sep 26, 2019
Pre-train, Interact, Fine-tune: A Novel Interaction Representation for Text Classification
Jianming Zheng, Fei Cai, Honghui Chen et al.
Text representation can aid machines in understanding text. Previous work on text representation often focuses on the so-called forward implication, i.e., preceding words are taken as the context of later words for creating representations, thus ignoring the fact that the semantics of a text segment is a product of the mutual implication of words in the text: later words contribute to the meaning of preceding words. We introduce the concept of interaction and propose a two-perspective interaction representation that encapsulates a local and a global interaction representation. Here, a local interaction representation is one that interacts among words with parent-children relationships on the syntactic trees, and a global interaction representation is one that interacts among all the words in a sentence. We combine the two interaction representations to develop a Hybrid Interaction Representation (HIR). Inspired by existing feature-based and fine-tuning-based pretrain-finetuning approaches to language models, we integrate the advantages of feature-based and fine-tuning-based methods to propose the Pre-train, Interact, Fine-tune (PIF) architecture. We evaluate our proposed models on five widely-used datasets for text classification tasks. Our ensemble method outperforms state-of-the-art baselines with improvements ranging from 2.03% to 3.15% in terms of error rate. In addition, we find that the improvements of PIF over most state-of-the-art methods are not affected by increasing text length.
IR Aug 27, 2019
Improving End-to-End Sequential Recommendations with Intent-aware Diversification
Wanyu Chen, Pengjie Ren, Fei Cai et al.
Sequential recommenders (SRs), which capture users' dynamic intents by modeling sequential user behavior, can make accurate product recommendations. Previous work on SRs mostly focuses on optimizing recommendation accuracy and often ignores recommendation diversity, even though it is an important criterion for evaluating recommendation performance. Most existing methods for improving the diversity of recommendations are not well suited to SRs because they assume that user intents are static and rely on post-processing the list of recommendations to promote diversity. We consider both recommendation accuracy and diversity for SRs by proposing an end-to-end neural model, called Intent-aware Diversified Sequential Recommendation (IDSR). Specifically, we introduce an Implicit Intent Mining (IIM) module into SRs to capture different user intents reflected in user behavior sequences. Then, we design an Intent-aware Diversity Promoting (IDP) loss to supervise the learning of the IIM module and force the model to take recommendation diversity into consideration during training. Extensive experiments on two benchmark datasets show that IDSR significantly outperforms state-of-the-art methods in terms of recommendation diversity while yielding comparable or superior recommendation accuracy.
IR Jul 8, 2019
Joint Neural Collaborative Filtering for Recommender Systems
Wanyu Chen, Fei Cai, Honghui Chen et al.
We propose a J-NCF method for recommender systems. The J-NCF model applies a joint neural network that couples deep feature learning and deep interaction modeling with a rating matrix. Deep feature learning extracts feature representations of users and items with a deep learning architecture based on a user-item rating matrix. Deep interaction modeling captures non-linear user-item interactions with a deep neural network using the feature representations generated by the deep feature learning process as input. J-NCF enables the deep feature learning and deep interaction modeling processes to optimize each other through joint training, which leads to improved recommendation performance. In addition, we design a new loss function for optimization, which takes both implicit and explicit feedback, as well as point-wise and pair-wise loss, into account. Experiments on several real-world datasets show significant improvements of J-NCF over state-of-the-art methods, with improvements of up to 8.24% on the MovieLens 100K dataset, 10.81% on the MovieLens 1M dataset, and 10.21% on the Amazon Movies dataset in terms of HR@10. NDCG@10 improvements are 12.42%, 14.24% and 15.06%, respectively. We also conduct experiments to evaluate the scalability and sensitivity of J-NCF. Our experiments show that the J-NCF model has competitive recommendation performance with inactive users and different degrees of data sparsity when compared to state-of-the-art baselines.
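The abstract mentions a loss that accounts for both point-wise and pair-wise terms but does not give its form. One common way to mix the two is sketched below, as an assumption rather than the J-NCF loss itself: a point-wise binary cross-entropy on one observed (positive) and one unobserved (negative) item, a pair-wise ranking term on their score difference, and a hypothetical mixing weight `alpha`. The handling of explicit versus implicit feedback is not covered here.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def pointwise_loss(pos_score, neg_score):
    """Binary cross-entropy treating raw scores as logits:
    the positive item should score high, the negative item low."""
    return -math.log(sigmoid(pos_score)) - math.log(sigmoid(-neg_score))


def pairwise_loss(pos_score, neg_score):
    """Ranking loss: the positive item should outscore the negative one."""
    return -math.log(sigmoid(pos_score - neg_score))


def hybrid_loss(pos_score, neg_score, alpha=0.5):
    """Weighted mix of the two terms; `alpha` is an assumed hyperparameter."""
    return (alpha * pairwise_loss(pos_score, neg_score)
            + (1.0 - alpha) * pointwise_loss(pos_score, neg_score))


# A correctly ranked pair incurs a smaller loss than a mis-ranked one.
good = hybrid_loss(2.0, -2.0)
bad = hybrid_loss(-2.0, 2.0)
```

The point-wise term anchors absolute score levels, while the pair-wise term only cares about relative order, so mixing them lets a model be trained for both calibration and ranking quality.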
IR May 8, 2018
Attention-based Hierarchical Neural Query Suggestion
Wanyu Chen, Fei Cai, Honghui Chen et al.
Query suggestions help users of a search engine to refine their queries. Previous work on query suggestion has mainly focused on incorporating directly observable features such as query co-occurrence and semantic similarity. The structure of such features is often set manually, as a result of which hidden dependencies between queries and users may be ignored. We propose an AHNQS model that combines a hierarchical structure with a session-level neural network and a user-level neural network to model the short- and long-term search history of a user. An attention mechanism is used to capture user preferences. On the AOL query log dataset, AHNQS improves over state-of-the-art RNN-based query suggestion baselines by up to 21.86% in terms of MRR@10 and 22.99% in terms of Recall@10; improvements are especially large for short sessions.