Sha Li

CL
h-index14
7papers
1,522citations
Novelty49%
AI Score36

7 Papers

24.3CLNov 3, 2022Code
Open-Vocabulary Argument Role Prediction for Event Extraction

Yizhu Jiao, Sha Li, Yiqing Xie et al.

The argument role in event extraction refers to the relation between an event and an argument participating in it. Despite the great progress in event extraction, existing studies still depend on roles pre-defined by domain experts. These studies expose obvious weakness when extending to emerging event types or new domains without available roles. Therefore, more attention and effort needs to be devoted to automatically customizing argument roles. In this paper, we define this essential but under-explored task: open-vocabulary argument role prediction. The goal of this task is to infer a set of argument roles for a given event type. We propose a novel unsupervised framework, RolePred for this task. Specifically, we formulate the role prediction problem as an in-filling task and construct prompts for a pre-trained language model to generate candidate roles. By extracting and analyzing the candidate arguments, the event-specific roles are further merged and selected. To standardize the research of this task, we collect a new event extraction dataset from WikiPpedia including 142 customized argument roles with rich semantics. On this dataset, RolePred outperforms the existing methods by a large margin. Source code and dataset are available on our GitHub repository: https://github.com/yzjiao/RolePred

24.2CLNov 30, 2022Code
Open Relation and Event Type Discovery with Type Abstraction

Sha Li, Heng Ji, Jiawei Han

Conventional closed-world information extraction (IE) approaches rely on human ontologies to define the scope for extraction. As a result, such approaches fall short when applied to new domains. This calls for systems that can automatically infer new types from given corpora, a task which we refer to as type discovery. To tackle this problem, we introduce the idea of type abstraction, where the model is prompted to generalize and name the type. Then we use the similarity between inferred names to induce clusters. Observing that this abstraction-based representation is often complementary to the entity/trigger token representation, we set up these two representations as two views and design our model as a co-training framework. Our experiments on multiple relation extraction and event extraction datasets consistently show the advantage of our type abstraction approach. Code available at https://github.com/raspberryice/type-discovery-abs.

3.0CLJun 15, 2022
Enhanced Knowledge Selection for Grounded Dialogues via Document Semantic Graphs

Sha Li, Mahdi Namazifar, Di Jin et al.

Providing conversation models with background knowledge has been shown to make open-domain dialogues more informative and engaging. Existing models treat knowledge selection as a sentence ranking or classification problem where each sentence is handled individually, ignoring the internal semantic connection among sentences in the background document. In this work, we propose to automatically convert the background knowledge documents into document semantic graphs and then perform knowledge selection over such graphs. Our document semantic graphs preserve sentence-level information through the use of sentence nodes and provide concept connections between sentences. We jointly apply multi-task learning for sentence-level and concept-level knowledge selection and show that it improves sentence-level selection. Our experiments show that our semantic graph-based knowledge selection improves over sentence selection baselines for both the knowledge selection task and the end-to-end response generation task on HollE and improves generalization on unseen topics in WoW.

21.8CLOct 31, 2023
Defining a New NLP Playground

Sha Li, Chi Han, Pengfei Yu et al.

The recent explosion of performance of large language models (LLMs) has changed the field of Natural Language Processing (NLP) more abruptly and seismically than any other shift in the field's 80-year history. This has resulted in concerns that the field will become homogenized and resource-intensive. The new status quo has put many academic researchers, especially PhD students, at a disadvantage. This paper aims to define a new NLP playground by proposing 20+ PhD-dissertation-worthy research directions, covering theoretical analysis, new and challenging problems, learning paradigms, and interdisciplinary applications.

6.7CLFeb 18, 2025
Oreo: A Plug-in Context Reconstructor to Enhance Retrieval-Augmented Generation

Sha Li, Naren Ramakrishnan

Retrieval-Augmented Generation (RAG) aims to augment the capabilities of Large Language Models (LLMs) by retrieving and incorporate external documents or chunks prior to generation. However, even improved retriever relevance can brings erroneous or contextually distracting information, undermining the effectiveness of RAG in downstream tasks. We introduce a compact, efficient, and pluggable module designed to refine retrieved chunks before using them for generation. The module aims to extract and reorganize the most relevant and supportive information into a concise, query-specific format. Through a three-stage training paradigm - comprising supervised fine - tuning, contrastive multi-task learning, and reinforcement learning-based alignment - it prioritizes critical knowledge and aligns it with the generator's preferences. This approach enables LLMs to produce outputs that are more accurate, reliable, and contextually appropriate.

1.4CLFeb 15, 2022
P4E: Few-Shot Event Detection as Prompt-Guided Identification and Localization

Sha Li, Liyuan Liu, Yiqing Xie et al.

We propose P4E, an identify-and-localize event detection framework that integrates the best of few-shot prompting and structured prediction. Our framework decomposes event detection into an identification task and a localization task. For the identification task, which we formulate as multi-label classification, we leverage cloze-based prompting to align our objective with the pre-training task of language models, allowing our model to quickly adapt to new event types. We then employ an event type-agnostic sequence labeling model to localize the event trigger conditioned on the identification output. This heterogeneous model design allows P4E to quickly learn new event types without sacrificing the ability to make structured predictions. Our experiments demonstrate the effectiveness of our proposed design, and P4E shows superior performance for few-shot event detection on benchmark datasets FewEvent and MAVEN and comparable performance to SOTA for fully-supervised event detection on ACE.

33.1CLApr 13, 2021Code
Document-Level Event Argument Extraction by Conditional Generation

Sha Li, Heng Ji, Jiawei Han

Event extraction has long been treated as a sentence-level task in the IE community. We argue that this setting does not match human information-seeking behavior and leads to incomplete and uninformative extraction results. We propose a document-level neural event argument extraction model by formulating the task as conditional generation following event templates. We also compile a new document-level event extraction benchmark dataset WikiEvents which includes complete event and coreference annotation. On the task of argument extraction, we achieve an absolute gain of 7.6% F1 and 5.7% F1 over the next best model on the RAMS and WikiEvents datasets respectively. On the more challenging task of informative argument extraction, which requires implicit coreference reasoning, we achieve a 9.3% F1 gain over the best baseline. To demonstrate the portability of our model, we also create the first end-to-end zero-shot event extraction framework and achieve 97% of fully supervised model's trigger extraction performance and 82% of the argument extraction performance given only access to 10 out of the 33 types on ACE.