IR · CL · SI · Feb 21, 2017

Efficient Social Network Multilingual Classification using Character, POS n-grams and Dynamic Normalization

arXiv:1702.06467v1
Originality: Incremental advance
AI Analysis

This work addresses author profiling in multilingual social media, with applications such as security and marketing. It is an incremental advance: it builds on existing methods with specific adaptations.

The paper tackles author profiling for short multilingual social network texts by applying dynamic normalization and extracting stylistic features such as character and POS n-grams, reporting up to 90% performance in experiments with an SVM.

In this paper we describe a dynamic normalization process applied to multilingual social network documents (Facebook and Twitter) to improve the performance of the author profiling task on short texts. After normalization, character $n$-grams and POS-tag $n$-grams are extracted to capture the stylistic information encoded in the documents (emoticons, character flooding, capital letters, references to other users, hyperlinks, hashtags, etc.). Experiments with an SVM reached up to 90% performance.
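The abstract does not spell out the normalization rules, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that hyperlinks, user references and hashtags are replaced by placeholder tokens and that character flooding is capped at three repeats, so that the stylistic signal survives in the n-gram counts. The placeholder strings, the function names `normalize`, `char_ngrams` and `pos_ngrams`, and the cap of three are all hypothetical choices. POS tags are assumed to come from an external tagger, which is not shown.

```python
import re
from collections import Counter

# Hypothetical placeholder tokens; the source does not say which ones are used.
URL_TAG, USER_TAG, HASH_TAG = "<url>", "<user>", "<hashtag>"


def normalize(text: str) -> str:
    """Replace volatile tokens with placeholders and cap character flooding."""
    text = re.sub(r"https?://\S+", URL_TAG, text)   # hyperlinks
    text = re.sub(r"@\w+", USER_TAG, text)          # references to other users
    text = re.sub(r"#\w+", HASH_TAG, text)          # hashtags
    # Assumption: runs of 3+ identical characters are cut to exactly 3,
    # which keeps flooding visible while bounding the vocabulary.
    text = re.sub(r"(.)\1{2,}", r"\1\1\1", text)
    return text


def char_ngrams(text: str, n: int = 3) -> Counter:
    """Count overlapping character n-grams (case and punctuation preserved)."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def pos_ngrams(tags: list[str], n: int = 2) -> Counter:
    """Count overlapping n-grams over a POS tag sequence from an external tagger."""
    return Counter(tuple(tags[i:i + n]) for i in range(len(tags) - n + 1))


if __name__ == "__main__":
    msg = normalize("Sooooo cool @bob http://t.co/x #fun")
    print(msg)
    print(char_ngrams(msg, 3).most_common(3))
    print(pos_ngrams(["DET", "NOUN", "VERB"], 2))
```

The resulting counters could be turned into feature vectors and passed to an SVM classifier, as in the experiments the abstract mentions. That step is omitted here because the source gives no details of the setup.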

