VnCoreNLP: A Vietnamese Natural Language Processing Toolkit
VnCoreNLP gives researchers and developers working on Vietnamese NLP a practical tool, though the contribution is incremental: it applies existing methods to a specific language.
The authors address the lack of a comprehensive NLP toolkit for Vietnamese by developing VnCoreNLP, a Java-based annotation pipeline that reports state-of-the-art results on word segmentation, POS tagging, NER, and dependency parsing.
We present VnCoreNLP, an easy-to-use and fast Java NLP annotation pipeline for Vietnamese. VnCoreNLP supports key natural language processing (NLP) tasks, including word segmentation, part-of-speech (POS) tagging, named entity recognition (NER), and dependency parsing, and obtains state-of-the-art (SOTA) results for these tasks. We release VnCoreNLP to provide rich linguistic annotations and to facilitate research on Vietnamese NLP. VnCoreNLP is open-source and available at: https://github.com/vncorenlp/VnCoreNLP