Ting Liu

h-index20

9papers

1,745citations

Novelty41%

AI Score33

Ranked #120,688 of 194,257 authors (top 62%)#22,121 in CL (top 72%)

9 Papers

37.7SPJul 6, 2023

UniCoRN: Unified Cognitive Signal ReconstructioN bridging cognitive signals and human language

Nuwa Xi, Sendong Zhao, Haochun Wang et al.

Decoding text stimuli from cognitive signals (e.g. fMRI) enhances our understanding of the human language system, paving the way for building versatile Brain-Computer Interface. However, existing studies largely focus on decoding individual word-level fMRI volumes from a restricted vocabulary, which is far too idealized for real-world application. In this paper, we propose fMRI2text, the first openvocabulary task aiming to bridge fMRI time series and human language. Furthermore, to explore the potential of this new task, we present a baseline solution, UniCoRN: the Unified Cognitive Signal ReconstructioN for Brain Decoding. By reconstructing both individual time points and time series, UniCoRN establishes a robust encoder for cognitive signals (fMRI & EEG). Leveraging a pre-trained language model as decoder, UniCoRN proves its efficacy in decoding coherent text from fMRI series across various split settings. Our model achieves a 34.77% BLEU score on fMRI2text, and a 37.04% BLEU when generalized to EEGto-text decoding, thereby surpassing the former baseline. Experimental results indicate the feasibility of decoding consecutive fMRI volumes, and the effectiveness of decoding different cognitive signals using a unified structure.

27.3CLSep 24, 2020Code

N-LTP: An Open-source Neural Language Technology Platform for Chinese

Wanxiang Che, Yunlong Feng, Libo Qin et al.

We introduce \texttt{N-LTP}, an open-source neural language technology platform supporting six fundamental Chinese NLP tasks: {lexical analysis} (Chinese word segmentation, part-of-speech tagging, and named entity recognition), {syntactic parsing} (dependency parsing), and {semantic parsing} (semantic dependency parsing and semantic role labeling). Unlike the existing state-of-the-art toolkits, such as \texttt{Stanza}, that adopt an independent model for each task, \texttt{N-LTP} adopts the multi-task framework by using a shared pre-trained model, which has the advantage of capturing the shared knowledge across relevant Chinese tasks. In addition, a knowledge distillation method \cite{DBLP:journals/corr/abs-1907-04829} where the single-task model teaches the multi-task model is further introduced to encourage the multi-task model to surpass its single-task teacher. Finally, we provide a collection of easy-to-use APIs and a visualization tool to make users to use and view the processing results more easily and directly. To the best of our knowledge, this is the first toolkit to support six Chinese NLP fundamental tasks. Source code, documentation, and pre-trained models are available at \url{https://github.com/HIT-SCIR/ltp}.

7.2CLNov 14, 2024

Unstructured Text Enhanced Open-domain Dialogue System: A Systematic Survey

Longxuan Ma, Mingda Li, Weinan Zhang et al.

Incorporating external knowledge into dialogue generation has been proven to benefit the performance of an open-domain Dialogue System (DS), such as generating informative or stylized responses, controlling conversation topics. In this article, we study the open-domain DS that uses unstructured text as external knowledge sources (\textbf{U}nstructured \textbf{T}ext \textbf{E}nhanced \textbf{D}ialogue \textbf{S}ystem, \textbf{UTEDS}). The existence of unstructured text entails distinctions between UTEDS and traditional data-driven DS and we aim to analyze these differences. We first give the definition of the UTEDS related concepts, then summarize the recently released datasets and models. We categorize UTEDS into Retrieval and Generative models and introduce them from the perspective of model components. The retrieval models consist of Fusion, Matching, and Ranking modules, while the generative models comprise Dialogue and Knowledge Encoding, Knowledge Selection, and Response Generation modules. We further summarize the evaluation methods utilized in UTEDS and analyze the current models' performance. At last, we discuss the future development trends of UTEDS, hoping to inspire new research in this field.

32.2CLJun 11, 2021Code

BoB: BERT Over BERT for Training Persona-based Dialogue Models from Limited Personalized Data

Haoyu Song, Yan Wang, Kaiyan Zhang et al.

Maintaining consistent personas is essential for dialogue agents. Although tremendous advancements have been brought, the limited-scale of annotated persona-dense data are still barriers towards training robust and consistent persona-based dialogue models. In this work, we show how the challenges can be addressed by disentangling persona-based dialogue generation into two sub-tasks with a novel BERT-over-BERT (BoB) model. Specifically, the model consists of a BERT-based encoder and two BERT-based decoders, where one decoder is for response generation, and another is for consistency understanding. In particular, to learn the ability of consistency understanding from large-scale non-dialogue inference data, we train the second decoder in an unlikelihood manner. Under different limited data settings, both automatic and human evaluations demonstrate that the proposed model outperforms strong baselines in response quality and persona consistency.

2.7CLOct 11, 2020

Few-shot Learning for Multi-label Intent Detection

Yutai Hou, Yongkui Lai, Yushan Wu et al.

In this paper, we study the few-shot multi-label classification for user intent detection. For multi-label intent detection, state-of-the-art work estimates label-instance relevance scores and uses a threshold to select multiple associated intent labels. To determine appropriate thresholds with only a few examples, we first learn universal thresholding experience on data-rich domains, and then adapt the thresholds to certain few-shot domains with a calibration based on nonparametric learning. For better calculation of label-instance relevance score, we introduce label name embedding as anchor points in representation space, which refines representations of different classes to be well-separated from each other. Experiments on two datasets show that the proposed model significantly outperforms strong baselines in both one-shot and five-shot settings.

1.1CLOct 8, 2020Code

Injecting Word Information with Multi-Level Word Adapter for Chinese Spoken Language Understanding

Dechuan Teng, Libo Qin, Wanxiang Che et al.

In this paper, we improve Chinese spoken language understanding (SLU) by injecting word information. Previous studies on Chinese SLU do not consider the word information, failing to detect word boundaries that are beneficial for intent detection and slot filling. To address this issue, we propose a multi-level word adapter to inject word information for Chinese SLU, which consists of (1) sentence-level word adapter, which directly fuses the sentence representations of the word information and character information to perform intent detection and (2) character-level word adapter, which is applied at each character for selectively controlling weights on word information as well as character information. Experimental results on two Chinese SLU datasets show that our model can capture useful word information and achieve state-of-the-art performance.

2.0CLJun 20, 2019

Few-Shot Sequence Labeling with Label Dependency Transfer and Pair-wise Embedding

Yutai Hou, Zhihan Zhou, Yijia Liu et al.

While few-shot classification has been widely explored with similarity based methods, few-shot sequence labeling poses a unique challenge as it also calls for modeling the label dependencies. To consider both the item similarity and label dependency, we propose to leverage the conditional random fields (CRFs) in few-shot sequence labeling. It calculates emission score with similarity based methods and obtains transition score with a specially designed transfer mechanism. When applying CRF in the few-shot scenarios, the discrepancy of label sets among different domains makes it hard to use the label dependency learned in prior domains. To tackle this, we introduce the dependency transfer mechanism that transfers abstract label transition patterns. In addition, the similarity methods rely on the high quality sample representation, which is challenging for sequence labeling, because sense of a word is different when measuring its similarity to words in different sentences. To remedy this, we take advantage of recent contextual embedding technique, and further propose a pair-wise embedder. It provides additional certainty for word sense by embedding query and support sentence pairwisely. Experimental results on slot tagging and named entity recognition show that our model significantly outperforms the strongest few-shot learning baseline by 11.76 (21.2%) and 12.18 (97.7%) F1 scores respectively in the one-shot setting.

4.6CVNov 28, 2016

Improving Fully Convolution Network for Semantic Segmentation

Bing Shuai, Ting Liu, Gang Wang

Fully Convolution Networks (FCN) have achieved great success in dense prediction tasks including semantic segmentation. In this paper, we start from discussing FCN by understanding its architecture limitations in building a strong segmentation network. Next, we present our Improved Fully Convolution Network (IFCN). In contrast to FCN, IFCN introduces a context network that progressively expands the receptive fields of feature maps. In addition, dense skip connections are added so that the context network can be effectively optimized. More importantly, these dense skip connections enable IFCN to fuse rich-scale context to make reliable predictions. Empirically, those architecture modifications are proven to be significant to enhance the segmentation performance. Without engaging any contextual post-processing, IFCN significantly advances the state-of-the-arts on ADE20K (ImageNet scene parsing), Pascal Context, Pascal VOC 2012 and SUN-RGBD segmentation datasets.

1.1CLDec 18, 2015

A Planning based Framework for Essay Generation

Bing Qin, Duyu Tang, Xinwei Geng et al.

Generating an article automatically with computer program is a challenging task in artificial intelligence and natural language processing. In this paper, we target at essay generation, which takes as input a topic word in mind and generates an organized article under the theme of the topic. We follow the idea of text planning \cite{Reiter1997} and develop an essay generation framework. The framework consists of three components, including topic understanding, sentence extraction and sentence reordering. For each component, we studied several statistical algorithms and empirically compared between them in terms of qualitative or quantitative analysis. Although we run experiments on Chinese corpus, the method is language independent and can be easily adapted to other language. We lay out the remaining challenges and suggest avenues for future research.