CLNov 28, 2017

Vietnamese Semantic Role Labelling

arXiv:1711.10124v16 citationsHas Code
Originality Incremental advance
AI Analysis

This work addresses the lack of SRL resources for Vietnamese, providing a foundational baseline for future research in Vietnamese natural language processing.

The paper tackles semantic role labeling (SRL) for Vietnamese by building the first Vietnamese PropBank corpus and developing a software system, achieving an F1 score of 74.77% through a novel constituent extraction algorithm and integration of distributed word features with statistical classifiers and integer linear programming.

In this paper, we study semantic role labelling (SRL), a subtask of semantic parsing of natural language sentences and its application for the Vietnamese language. We present our effort in building Vietnamese PropBank, the first Vietnamese SRL corpus and a software system for labelling semantic roles of Vietnamese texts. In particular, we present a novel constituent extraction algorithm in the argument candidate identification step which is more suitable and more accurate than the common node-mapping method. In the machine learning part, our system integrates distributed word features produced by two recent unsupervised learning models in two learned statistical classifiers and makes use of integer linear programming inference procedure to improve the accuracy. The system is evaluated in a series of experiments and achieves a good result, an $F_1$ score of 74.77%. Our system, including corpus and software, is available as an open source project for free research and we believe that it is a good baseline for the development of future Vietnamese SRL systems.

Code Implementations2 repos
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes