Parsing Thai Social Data: A New Challenge for Thai NLP
This paper addresses the problem of parsing informal social media language for Thai NLP. The contribution is incremental, as it adapts existing methods to new data.
The paper tackles dependency parsing for informal Thai social network data, achieving an unlabeled attachment score (UAS) of 81.42% with a transition-based model, the improved Elkared dependency parser.
Dependency parsing (DP) is a task that analyzes the syntactic structure of text and the relationships between words. DP is widely used to improve natural language processing (NLP) applications in many languages, such as English. Previous work on DP is generally applicable to formally written language, but it does not apply to informal language such as that used on social networks. DP therefore has to be researched and explored with social network data. In this paper, we explore and identify a DP model that is suitable for Thai social network data, and then identify the appropriate linguistic unit to use as input. The results show that the transition-based model, the improved Elkared dependency parser, outperforms the others with a UAS of 81.42%.
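
To make the reported metric concrete, the following is a minimal sketch of how UAS (unlabeled attachment score) is typically computed: the fraction of tokens whose predicted head matches the gold head, ignoring the relation label. The function name `unlabeled_attachment_score` and the five-token example are hypothetical illustrations and are not taken from the paper or its evaluation setup.

```python
from typing import Sequence


def unlabeled_attachment_score(gold_heads: Sequence[int],
                               pred_heads: Sequence[int]) -> float:
    """Return the fraction of tokens whose predicted head equals the gold head."""
    if len(gold_heads) != len(pred_heads):
        raise ValueError("gold and predicted head lists must have the same length")
    if not gold_heads:
        raise ValueError("cannot score an empty sentence")
    # Count tokens attached to the correct head; dependency labels are ignored.
    correct = sum(g == p for g, p in zip(gold_heads, pred_heads))
    return correct / len(gold_heads)


# Hypothetical 5-token sentence (assumption, not data from the paper).
# Heads are 1-based token indices; 0 marks the root.
gold = [2, 0, 2, 5, 3]
pred = [2, 0, 2, 3, 3]  # the fourth token is attached to the wrong head

score = unlabeled_attachment_score(gold, pred)
print(f"UAS = {score:.2%}")  # prints "UAS = 80.00%"
```

A corpus-level score such as the 81.42% above is usually obtained by counting correct heads over all tokens in the test set rather than averaging per-sentence scores, although the exact protocol used in the paper is not stated here.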