Tran Binh Dang

h-index: 3

Papers: 4

Citations: 62

Novelty: 30%

AI Score: 19

Ranked #187,645 of 194,257 authors (top 97%); #30,205 in CL (top 98%)

4 Papers

0.2 | CL | Sep 11, 2021

HYDRA -- Hyper Dependency Representation Attentions

Ha-Thanh Nguyen, Vu Tran, Tran-Binh Dang et al.

Attention is all we need as long as we have enough data. Even so, it is sometimes not easy to determine how much data is enough while the models are becoming larger and larger. In this paper, we propose HYDRA heads, lightweight pretrained linguistic self-attention heads to inject knowledge into transformer models without pretraining them again. Our approach is a balanced paradigm between leaving the models to learn unsupervised and forcing them to conform to linguistic knowledge rigidly as suggested in previous studies. Our experiments show that the approach not only boosts the performance of the model but is also lightweight and architecture-friendly. We empirically verify our framework on benchmark datasets to show the contribution of linguistic knowledge to a transformer model. This is a promising result for a new approach to transferring knowledge from linguistic resources into transformer-based models.

1.6 | CL | Jun 25, 2021

JNLP Team: Deep Learning Approaches for Legal Processing Tasks in COLIEE 2021

Ha-Thanh Nguyen, Phuong Minh Nguyen, Thi-Hai-Yen Vuong et al.

COLIEE is an annual competition in automatic legal text processing. Automatic legal document processing is an ambitious goal, and the structure and semantics of the law are often far more complex than everyday language. In this article, we survey and report our methods and experimental results in using deep learning in legal document processing. The results show both the difficulties and the potential of this family of approaches.

1.4 | CL | Jun 25, 2021

ParaLaw Nets -- Cross-lingual Sentence-level Pretraining for Legal Text Processing

Ha-Thanh Nguyen, Vu Tran, Phuong Minh Nguyen et al.

Ambiguity is a characteristic of natural language, which makes the expression of ideas flexible. However, in a domain that requires accurate statements, it becomes a barrier. Specifically, a single word can have many meanings and multiple words can have the same meaning. When translating a text into a foreign language, the translator needs to determine the exact meaning of each element in the original sentence to produce the correct translated sentence. From that observation, in this paper, we propose ParaLaw Nets, a pretrained model family using sentence-level cross-lingual information to reduce ambiguity and increase performance in legal text processing. This approach achieved the best result in the Question Answering task of COLIEE-2021.

2.2 | CL | Nov 4, 2020

JNLP Team: Deep Learning for Legal Processing in COLIEE 2020

Ha-Thanh Nguyen, Hai-Yen Thi Vuong, Phuong Minh Nguyen et al.

We propose deep-learning-based methods for automatic legal retrieval and legal question-answering systems in COLIEE 2020. These systems are all characterized by being pre-trained on large amounts of data before being fine-tuned for the specified tasks. This approach helps to overcome data scarcity and achieve good performance, and can thus be useful for tackling related problems in information retrieval and decision support in the legal domain. The approach can also be explored for other domain-specific problems.