Xianglin Zuo

CL
3papers
345citations
Novelty52%
AI Score30

3 Papers

CLMay 7, 2022
Label-aware Multi-level Contrastive Learning for Cross-lingual Spoken Language Understanding

Shining Liang, Linjun Shou, Jian Pei et al.

Despite the great success of spoken language understanding (SLU) in high-resource languages, it remains challenging in low-resource languages mainly due to the lack of labeled training data. The recent multilingual code-switching approach achieves better alignments of model representations across languages by constructing a mixed-language context in zero-shot cross-lingual SLU. However, current code-switching methods are limited to implicit alignment and disregard the inherent semantic structure in SLU, i.e., the hierarchical inclusion of utterances, slots, and words. In this paper, we propose to model the utterance-slot-word structure by a multi-level contrastive learning framework at the utterance, slot, and word levels to facilitate explicit alignment. Novel code-switching schemes are introduced to generate hard negative examples for our contrastive learning framework. Furthermore, we develop a label-aware joint model leveraging label semantics to enhance the implicit alignment and feed to contrastive learning. Our experimental results show that our proposed methods significantly improve the performance compared with the strong baselines on two zero-shot cross-lingual SLU benchmark datasets.

CLJun 1, 2021Code
Reinforced Iterative Knowledge Distillation for Cross-Lingual Named Entity Recognition

Shining Liang, Ming Gong, Jian Pei et al.

Named entity recognition (NER) is a fundamental component in many applications, such as Web Search and Voice Assistants. Although deep neural networks greatly improve the performance of NER, due to the requirement of large amounts of training data, deep neural networks can hardly scale out to many languages in an industry setting. To tackle this challenge, cross-lingual NER transfers knowledge from a rich-resource language to languages with low resources through pre-trained multilingual language models. Instead of using training data in target languages, cross-lingual NER has to rely on only training data in source languages, and optionally adds the translated training data derived from source languages. However, the existing cross-lingual NER methods do not make good use of rich unlabeled data in target languages, which is relatively easy to collect in industry applications. To address the opportunities and challenges, in this paper we describe our novel practice in Microsoft to leverage such large amounts of unlabeled data in target languages in real production settings. To effectively extract weak supervision signals from the unlabeled data, we develop a novel approach based on the ideas of semi-supervised learning and reinforcement learning. The empirical study on three benchmark data sets verifies that our approach establishes the new state-of-the-art performance with clear edges. Now, the NER techniques reported in this paper are on their way to become a fundamental component for Web ranking, Entity Pane, Answers Triggering, and Question Answering in the Microsoft Bing search engine. Moreover, our techniques will also serve as part of the Spoken Language Understanding module for a commercial voice assistant. We plan to open source the code of the prototype framework after deployment.

CLAug 18, 2019
A Multi-level Neural Network for Implicit Causality Detection in Web Texts

Shining Liang, Wanli Zuo, Zhenkun Shi et al.

Mining causality from text is a complex and crucial natural language understanding task corresponding to the human cognition. Existing studies at its solution can be grouped into two primary categories: feature engineering based and neural model based methods. In this paper, we find that the former has incomplete coverage and inherent errors but provide prior knowledge; while the latter leverages context information but causal inference of which is insufficiency. To handle the limitations, we propose a novel causality detection model named MCDN to explicitly model causal reasoning process, and furthermore, to exploit the advantages of both methods. Specifically, we adopt multi-head self-attention to acquire semantic feature at word level and develop the SCRN to infer causality at segment level. To the best of our knowledge, with regards to the causality tasks, this is the first time that the Relation Network is applied. The experimental results show that: 1) the proposed approach performs prominent performance on causality detection; 2) further analysis manifests the effectiveness and robustness of MCDN.