Ndapa Nakashole

CL
h-index3
11papers
4,703citations
Novelty40%
AI Score31

11 Papers

CLDec 21, 2022Code
Zero-shot Triplet Extraction by Template Infilling

Bosung Kim, Hayate Iso, Nikita Bhutani et al. · nvidia

The task of triplet extraction aims to extract pairs of entities and their corresponding relations from unstructured text. Most existing methods train an extraction model on training data involving specific target relations, and are incapable of extracting new relations that were not observed at training time. Generalizing the model to unseen relations typically requires fine-tuning on synthetic training data which is often noisy and unreliable. We show that by reducing triplet extraction to a template infilling task over a pre-trained language model (LM), we can equip the extraction model with zero-shot learning capabilities and eliminate the need for additional training data. We propose a novel framework, ZETT (ZEro-shot Triplet extraction by Template infilling), that aligns the task objective to the pre-training objective of generative transformers to generalize to unseen relations. Experiments on FewRel and Wiki-ZSL datasets demonstrate that ZETT shows consistent and stable performance, outperforming previous state-of-the-art methods, even when using automatically generated templates. https://github.com/megagonlabs/zett/

CLSep 30, 2022
Medical Question Understanding and Answering with Knowledge Grounding and Semantic Self-Supervision

Khalil Mrini, Harpreet Singh, Franck Dernoncourt et al.

Current medical question answering systems have difficulty processing long, detailed and informally worded questions submitted by patients, called Consumer Health Questions (CHQs). To address this issue, we introduce a medical question understanding and answering system with knowledge grounding and semantic self-supervision. Our system is a pipeline that first summarizes a long, medical, user-written question, using a supervised summarization loss. Then, our system performs a two-step retrieval to return answers. The system first matches the summarized user question with an FAQ from a trusted medical knowledge base, and then retrieves a fixed number of relevant sentences from the corresponding answer document. In the absence of labels for question matching or answer relevance, we design 3 novel, self-supervised and semantically-guided losses. We evaluate our model against two strong retrieval-based question answering baselines. Evaluators ask their own questions and rate the answers retrieved by our baselines and own system according to their relevance. They find that our system retrieves more relevant answers, while achieving speeds 20 times faster. Our self-supervised losses also help the summarizer achieve higher scores in ROUGE, as well as in human evaluation metrics. We release our code to encourage further research.

CLApr 3, 2024
On Linearizing Structured Data in Encoder-Decoder Language Models: Insights from Text-to-SQL

Yutong Shao, Ndapa Nakashole

Structured data, prevalent in tables, databases, and knowledge graphs, poses a significant challenge in its representation. With the advent of large language models (LLMs), there has been a shift towards linearization-based methods, which process structured data as sequential token streams, diverging from approaches that explicitly model structure, often as a graph. Crucially, there remains a gap in our understanding of how these linearization-based methods handle structured data, which is inherently non-linear. This work investigates the linear handling of structured data in encoder-decoder language models, specifically T5. Our findings reveal the model's ability to mimic human-designed processes such as schema linking and syntax prediction, indicating a deep, meaningful learning of structure beyond simple token sequencing. We also uncover insights into the model's internal mechanisms, including the ego-centric nature of structure node encodings and the potential for model compression due to modality fusion redundancy. Overall, this work sheds light on the inner workings of linearization-based methods and could potentially provide guidance for future research.

CLNov 28, 2021
A Grounded Well-being Conversational Agent with Multiple Interaction Modes: Preliminary Results

Xinxin Yan, Ndapa Nakashole

Technologies for enhancing well-being, healthcare vigilance and monitoring are on the rise. However, despite patient interest, such technologies suffer from low adoption. One hypothesis for this limited adoption is loss of human interaction that is central to doctor-patient encounters. In this paper we seek to address this limitation via a conversational agent that adopts one aspect of in-person doctor-patient interactions: A human avatar to facilitate medical grounded question answering. This is akin to the in-person scenario where the doctor may point to the human body or the patient may point to their own body to express their conditions. Additionally, our agent has multiple interaction modes, that may give more options for the patient to use the agent, not just for medical question answering, but also to engage in conversations about general topics and current events. Both the avatar, and the multiple interaction modes could help improve adherence. We present a high level overview of the design of our agent, Marie Bot Wellbeing. We also report implementation details of our early prototype , and present preliminary results.

CLMar 22, 2021
Extracting the Unknown from Long Math Problems

Ndapa Nakashole

In problem solving, understanding the problem that one seeks to solve is an essential initial step. In this paper, we propose computational methods for facilitating problem understanding through the task of recognizing the unknown in specifications of long Math problems. We focus on the topic of Probability. Our experimental results show that learning models yield strong results on the task, a promising first step towards human interpretable, modular approaches to understanding long Math problems.

CLNov 10, 2019
Rethinking Self-Attention: Towards Interpretability in Neural Parsing

Khalil Mrini, Franck Dernoncourt, Quan Tran et al.

Attention mechanisms have improved the performance of NLP tasks while allowing models to remain explainable. Self-attention is currently widely used, however interpretability is difficult due to the numerous attention distributions. Recent work has shown that model representations can benefit from label-specific information, while facilitating interpretation of predictions. We introduce the Label Attention Layer: a new form of self-attention where attention heads represent labels. We test our novel layer by running constituency and dependency parsing experiments and show our new model obtains new state-of-the-art results for both tasks on both the Penn Treebank (PTB) and Chinese Treebank. Additionally, our model requires fewer self-attention layers compared to existing work. Finally, we find that the Label Attention heads learn relations between syntactic categories and show pathways to analyze errors.

CLMay 31, 2019
Fine-Grained Spoiler Detection from Large-Scale Review Corpora

Mengting Wan, Rishabh Misra, Ndapa Nakashole et al.

This paper presents computational approaches for automatically detecting critical plot twists in reviews of media products. First, we created a large-scale book review dataset that includes fine-grained spoiler annotations at the sentence-level, as well as book and (anonymized) user information. Second, we carefully analyzed this dataset, and found that: spoiler language tends to be book-specific; spoiler distributions vary greatly across books and review authors; and spoiler sentences tend to jointly appear in the latter part of reviews. Third, inspired by these findings, we developed an end-to-end neural network architecture to detect spoiler sentences in review corpora. Quantitative and qualitative results demonstrate that the proposed method substantially outperforms existing baselines.

CLNov 17, 2018
Sense Perception Common Sense Relationships

Ndapa Nakashole

Often missing in existing knowledge bases of facts, are relationships that encode common sense knowledge about unnamed entities. In this paper, we propose to extract novel, common sense relationships pertaining to sense perception concepts such as sound and smell.

CLNov 17, 2018
Unnamed Entity Recognition of Sense Mentions

Ndapa Nakashole

We consider the problem of recognizing mentions of human senses in text. Our contribution is a method for acquiring labeled data, and a learning method that is trained on this data. Experiments show the effectiveness of our proposed data labeling approach and our learning model on the task of sense recognition in text.

CLNov 17, 2018
Bilingual Dictionary Induction for Bantu Languages

Ndapa Nakashole

We present a method for learning bilingual translation dictionaries between English and Bantu languages. We show that exploiting the grammatical structure common to Bantu languages enables bilingual dictionary induction for languages where training data is unavailable.

CLJun 7, 2018
Characterizing Departures from Linearity in Word Translation

Ndapa Nakashole, Raphael Flauger

We investigate the behavior of maps learned by machine translation methods. The maps translate words by projecting between word embedding spaces of different languages. We locally approximate these maps using linear maps, and find that they vary across the word embedding space. This demonstrates that the underlying maps are non-linear. Importantly, we show that the locally linear maps vary by an amount that is tightly correlated with the distance between the neighborhoods on which they are trained. Our results can be used to test non-linear methods, and to drive the design of more accurate maps for word translation.