CL | May 7, 2022
Label-aware Multi-level Contrastive Learning for Cross-lingual Spoken Language Understanding
Shining Liang, Linjun Shou, Jian Pei et al.
Despite the great success of spoken language understanding (SLU) in high-resource languages, it remains challenging in low-resource languages, mainly due to the lack of labeled training data. The recent multilingual code-switching approach achieves better alignment of model representations across languages by constructing a mixed-language context in zero-shot cross-lingual SLU. However, current code-switching methods are limited to implicit alignment and disregard the inherent semantic structure in SLU, i.e., the hierarchical inclusion of utterances, slots, and words. In this paper, we propose to model the utterance-slot-word structure with a multi-level contrastive learning framework at the utterance, slot, and word levels to facilitate explicit alignment. Novel code-switching schemes are introduced to generate hard negative examples for our contrastive learning framework. Furthermore, we develop a label-aware joint model that leverages label semantics to enhance the implicit alignment and feed into contrastive learning. Our experimental results show that the proposed methods significantly improve performance over strong baselines on two zero-shot cross-lingual SLU benchmark datasets.
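To make the idea of contrastive alignment with hard negatives concrete, here is a minimal sketch of an InfoNCE-style loss for a single anchor. The abstract does not state the loss actually used, so the InfoNCE form, the cosine similarity, the temperature value, and the names `cosine` and `info_nce` are all assumptions for illustration. In the paper the negatives come from code-switching schemes; here they are simply given as vectors.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_u * norm_v)


def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style loss for one anchor (assumed form, not the paper's).

    The loss is small when the anchor is closer to the positive than to
    every negative, and large when a negative is closer than the positive.
    """
    pos_score = math.exp(cosine(anchor, positive) / temperature)
    neg_score = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(pos_score / (pos_score + neg_score))


# Hypothetical 2-d representations: an utterance, its aligned counterpart,
# and two negatives.
anchor = [1.0, 0.0]
aligned_loss = info_nce(anchor, [0.9, 0.1], [[0.0, 1.0], [-1.0, 0.0]])
misaligned_loss = info_nce(anchor, [0.0, 1.0], [[0.9, 0.1], [-1.0, 0.0]])
```

The same function could in principle be applied at the utterance, slot, and word levels by changing what the vectors represent; how the paper combines the levels is not described in the abstract.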
CL | Apr 21, 2023
TC-GAT: Graph Attention Network for Temporal Causality Discovery
Xiaosong Yuan, Ke Chen, Wanli Zuo et al.
The present study explores causal relationship extraction, a vital component in the pursuit of causality knowledge. Causality is frequently intertwined with temporal elements, as the progression from cause to effect is not instantaneous but unfolds over time. The extraction of temporal causality is therefore of considerable importance in the field. In light of this, we propose a method for extracting causality from text that integrates both temporal and causal relations, with a particular focus on the time aspect. To this end, we first compile a dataset that encompasses temporal relationships. Subsequently, we present a novel model, TC-GAT, which employs a graph attention mechanism to assign weights to the temporal relationships and leverages a causal knowledge graph to determine the adjacency matrix. Additionally, we implement an equilibrium mechanism to regulate the interplay between temporal and causal relations. Our experiments demonstrate that our proposed method significantly surpasses baseline models in the task of causality extraction.
CL | Jun 1, 2021 | Code
Reinforced Iterative Knowledge Distillation for Cross-Lingual Named Entity Recognition
Shining Liang, Ming Gong, Jian Pei et al.
Named entity recognition (NER) is a fundamental component in many applications, such as Web Search and Voice Assistants. Although deep neural networks greatly improve NER performance, their need for large amounts of training data makes it hard to scale them to many languages in an industry setting. To tackle this challenge, cross-lingual NER transfers knowledge from a rich-resource language to languages with low resources through pre-trained multilingual language models. Instead of using training data in target languages, cross-lingual NER has to rely only on training data in source languages, and optionally adds translated training data derived from the source languages. However, existing cross-lingual NER methods do not make good use of rich unlabeled data in target languages, which is relatively easy to collect in industry applications. To address the opportunities and challenges, in this paper we describe our novel practice at Microsoft to leverage such large amounts of unlabeled data in target languages in real production settings. To effectively extract weak supervision signals from the unlabeled data, we develop a novel approach based on the ideas of semi-supervised learning and reinforcement learning. The empirical study on three benchmark data sets verifies that our approach establishes new state-of-the-art performance by a clear margin. The NER techniques reported in this paper are now on their way to becoming a fundamental component for Web ranking, Entity Pane, Answers Triggering, and Question Answering in the Microsoft Bing search engine. Moreover, our techniques will also serve as part of the Spoken Language Understanding module for a commercial voice assistant. We plan to open source the code of the prototype framework after deployment.
CV | Jun 3, 2018 | Code
Low Cost Edge Sensing for High Quality Demosaicking
Yan Niu, Jihong Ouyang, Wanli Zuo et al.
Digital cameras that use Color Filter Arrays (CFA) require a demosaicking procedure to form full RGB images. As today's camera users generally expect to view images instantly, demosaicking algorithms for real applications must be fast. Moreover, the associated cost should be lower than the cost saved by using a CFA. For this purpose, we revisit the classical Hamilton-Adams (HA) algorithm, which outperforms many sophisticated techniques in both speed and accuracy. Inspired by HA's strengths and weaknesses, we design a very low cost edge sensing scheme. Briefly, it guides demosaicking by a logistic functional of the difference between directional variations. We extensively compare our algorithm with 28 demosaicking algorithms by running their open source codes on benchmark datasets. Compared to methods of similar computational cost, our method achieves substantially higher accuracy, whereas compared to methods of similar accuracy, our method has significantly lower cost. Moreover, on test images of currently popular resolution, the quality of our algorithm is comparable to that of top performers, while it runs tens of times faster.
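The phrase "a logistic functional of the difference between directional variations" can be illustrated with a small sketch. The abstract does not give the actual functional, so everything below is an assumption: the function names, the `steepness` parameter, and the use of simple first-order differences of the four neighbouring green samples. It also omits the second-order correction terms of the Hamilton-Adams algorithm. The sketch estimates the missing green value at a red or blue site by softly blending a horizontal and a vertical estimate.

```python
import math


def logistic(x):
    """Numerically stable logistic function 1 / (1 + exp(-x))."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def green_at(mosaic, i, j, steepness=0.1):
    """Estimate green at an interior red/blue site (i, j) of a Bayer mosaic.

    `mosaic` is a 2-d list of raw samples; the four direct neighbours of a
    red/blue site are green samples. `steepness` is a hypothetical tuning
    parameter controlling how sharply the blend favours one direction.
    """
    left, right = mosaic[i][j - 1], mosaic[i][j + 1]
    up, down = mosaic[i - 1][j], mosaic[i + 1][j]

    # Directional variations: a large value suggests an edge is crossed
    # when moving along that direction.
    var_h = abs(left - right)
    var_v = abs(up - down)

    # Weight of the horizontal estimate grows as the vertical variation
    # exceeds the horizontal one, i.e. as the edge runs horizontally.
    w_h = logistic(steepness * (var_v - var_h))

    est_h = (left + right) / 2.0
    est_v = (up + down) / 2.0
    return w_h * est_h + (1.0 - w_h) * est_v


# Horizontal structure: rows are constant, so the horizontal estimate (50)
# should dominate the vertical one (35).
striped = [[10, 10, 10], [50, 0, 50], [60, 60, 60]]
striped_estimate = green_at(striped, 1, 1)

# Flat neighbourhood: both directions agree, so the weight is 0.5.
flat = [[20, 30, 20], [30, 0, 30], [20, 30, 20]]
flat_estimate = green_at(flat, 1, 1)
```

A soft logistic weight avoids the hard horizontal-or-vertical decision of a purely threshold-based scheme while still costing only a handful of operations per pixel.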
CL | Nov 11, 2020
CalibreNet: Calibration Networks for Multilingual Sequence Labeling
Shining Liang, Linjun Shou, Jian Pei et al.
Lack of training data in low-resource languages presents huge challenges to sequence labeling tasks such as named entity recognition (NER) and machine reading comprehension (MRC). One major obstacle is errors at the boundaries of predicted answers. To tackle this problem, we propose CalibreNet, which predicts answers in two steps. In the first step, any existing sequence labeling method can be adopted as a base model to generate an initial answer. In the second step, CalibreNet refines the boundary of the initial answer. To address the lack of training data in low-resource languages, we develop a novel unsupervised phrase boundary recovery pre-training task to enhance the multilingual boundary detection capability of CalibreNet. Experiments on two cross-lingual benchmark datasets show that the proposed approach achieves state-of-the-art results on zero-shot cross-lingual NER and MRC tasks.
LG | Jan 6, 2020
A Block-based Generative Model for Attributed Networks Embedding
Xueyan Liu, Bo Yang, Wenzhuo Song et al.
Attributed network embedding has attracted considerable interest in recent years. It aims to learn task-independent, low-dimensional, and continuous vectors for nodes that preserve both topology and attribute information. Most existing methods, such as random-walk based methods and GCNs, mainly focus on local information, i.e., the attributes of the neighbours. Thus, they have been well studied for assortative networks (i.e., networks with communities) but neglect disassortative networks (i.e., networks with multipartite, hub, and hybrid structures), which are common in the real world. To model both assortative and disassortative networks, we propose a block-based generative model for attributed network embedding from a probabilistic perspective. Specifically, the nodes are assigned to several blocks, and nodes in the same block share similar linkage patterns. These patterns can define assortative networks containing communities or disassortative networks with multipartite, hub, or any hybrid structures. To preserve the attribute information, we assume that each node has a hidden embedding related to its assigned block. We use a neural network to characterize the nonlinearity between node embeddings and node attributes. We perform extensive experiments on real-world and synthetic attributed networks. The results show that our proposed method consistently outperforms state-of-the-art embedding methods for both clustering and classification tasks, especially on disassortative networks.
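The way block-level linkage patterns can express both assortative and disassortative structure can be sketched with a classic stochastic block model. This is an illustration of the general idea only, not the paper's model: it leaves out node attributes, hidden embeddings, and the neural network, and the names `sample_sbm`, `block_of`, and `block_probs` are hypothetical.

```python
import random


def sample_sbm(block_of, block_probs, rng):
    """Sample an undirected graph from a simple stochastic block model.

    block_of[u] is the block index of node u, and block_probs[a][b] is the
    probability of an edge between a node in block a and a node in block b.
    Returns a list of edges (u, v) with u < v.
    """
    n = len(block_of)
    edges = []
    for u in range(n):
        for v in range(u + 1, n):
            p = block_probs[block_of[u]][block_of[v]]
            if rng.random() < p:
                edges.append((u, v))
    return edges


rng = random.Random(0)
block_of = [0, 0, 1, 1]

# Assortative pattern: nodes link within their own block (communities).
community_edges = sample_sbm(block_of, [[1.0, 0.0], [0.0, 1.0]], rng)

# Disassortative pattern: nodes link only across blocks (bipartite).
bipartite_edges = sample_sbm(block_of, [[0.0, 1.0], [1.0, 0.0]], rng)
```

Only the block probability matrix changes between the two cases, which is why a block-based formulation can cover communities, multipartite structure, and mixtures of both with the same machinery.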
CL | Aug 18, 2019
A Multi-level Neural Network for Implicit Causality Detection in Web Texts
Shining Liang, Wanli Zuo, Zhenkun Shi et al.
Mining causality from text is a complex and crucial natural language understanding task that corresponds to human cognition. Existing approaches can be grouped into two primary categories: feature engineering based methods and neural model based methods. In this paper, we find that the former have incomplete coverage and inherent errors but provide prior knowledge, while the latter leverage context information but are insufficient at causal inference. To address these limitations, we propose a novel causality detection model named MCDN that explicitly models the causal reasoning process and exploits the advantages of both kinds of methods. Specifically, we adopt multi-head self-attention to acquire semantic features at the word level and develop the SCRN to infer causality at the segment level. To the best of our knowledge, this is the first time that the Relation Network has been applied to causality tasks. The experimental results show that: 1) the proposed approach achieves strong performance on causality detection; 2) further analysis demonstrates the effectiveness and robustness of MCDN.