LG SI Nov 16, 2022

Few-shot Learning for Multi-modal Social Media Event Filtering

José Nascimento, João Phillipe Cardenuto, Jing Yang, Anderson Rocha

arXiv:2211.10340v13.34 citations | h-index: 58 | Has Code

Originality: Incremental advance

AI Analysis

This paper addresses the challenge of event filtering in social media for researchers and analysts who have little labeled data. It represents an incremental improvement over existing supervised methods.

The paper tackles the problem of filtering noisy social media data for event analysis by proposing a graph-based few-shot learning pipeline. With only 60 labeled samples, the pipeline achieves performance comparable to training on a fully labeled set of 3,100 samples.

Social media has become an important data source for event analysis. Most of the data collected this way contains no information useful for the target event. It is therefore essential to filter out the noisy data as early as possible, so that a human expert can inspect what remains. Most existing solutions for event filtering rely on fully supervised training. However, in many real-world scenarios, a large number of labeled samples is not available. To address the problem of training with only a few labeled samples, we propose a graph-based few-shot learning pipeline for event filtering. We also release the Brazilian Protest Dataset to test our method. To the best of our knowledge, this dataset is the first of its kind in event filtering that focuses on protests in multi-modal social media data, with most of the text in Portuguese. Our experimental results show that the proposed pipeline, trained with only a few labeled samples (60), achieves performance comparable to training on the fully labeled dataset (3,100). To support the research community, we make our dataset and code available at https://github.com/jdnascim/7Set-AL.
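The abstract does not spell out how the graph-based pipeline works, so the following is only a rough illustration of the general idea of learning from a few labels over a graph. It is not the authors' method. It swaps in plain label propagation on a k-nearest-neighbor graph, and it uses synthetic 2D points as a stand-in for multi-modal post embeddings. All names (`knn_graph`, `propagate_labels`, the cluster centers, `k=8`) are assumptions made for this sketch.

```python
import math
import random


def knn_graph(points, k):
    """Build a symmetric k-nearest-neighbor graph as a list of neighbor sets."""
    n = len(points)
    neighbors = [set() for _ in range(n)]
    for i in range(n):
        # Sort all other points by Euclidean distance to point i.
        dists = sorted(
            (math.dist(points[i], points[j]), j) for j in range(n) if j != i
        )
        for _, j in dists[:k]:
            neighbors[i].add(j)
            neighbors[j].add(i)  # symmetrize the edge
    return neighbors


def propagate_labels(neighbors, seed_labels, n_classes, n_iters=50):
    """Spread the few known labels over the graph by neighbor averaging.

    Labeled (seed) nodes are clamped to their known class on every iteration.
    """
    n = len(neighbors)
    scores = [[0.0] * n_classes for _ in range(n)]
    for i, c in seed_labels.items():
        scores[i][c] = 1.0
    for _ in range(n_iters):
        new_scores = []
        for i in range(n):
            if i in seed_labels:
                new_scores.append(scores[i])  # keep seeds fixed
                continue
            avg = [
                sum(scores[j][c] for j in neighbors[i]) / len(neighbors[i])
                for c in range(n_classes)
            ]
            new_scores.append(avg)
        scores = new_scores
    # Predicted class = highest propagated score.
    return [max(range(n_classes), key=lambda c: s[c]) for s in scores]


# Synthetic stand-in data (hypothetical): class 0 = irrelevant, class 1 = relevant.
random.seed(0)
centers = [(0.0, 0.0), (6.0, 6.0)]
points, true_labels = [], []
for label, (cx, cy) in enumerate(centers):
    for _ in range(60):
        points.append((random.gauss(cx, 1.0), random.gauss(cy, 1.0)))
        true_labels.append(label)

# Only three labeled samples per class; the other 114 points are unlabeled.
seeds = {0: 0, 1: 0, 2: 0, 60: 1, 61: 1, 62: 1}

graph = knn_graph(points, k=8)
predicted = propagate_labels(graph, seeds, n_classes=2)
accuracy = sum(p == t for p, t in zip(predicted, true_labels)) / len(points)
print(f"accuracy with {len(seeds)} labels out of {len(points)}: {accuracy:.2f}")
```

The sketch shows why a graph can help when labels are scarce: unlabeled samples inherit labels from nearby labeled ones through the neighborhood structure. Real social media embeddings are far less cleanly separated than these two synthetic clusters, so the accuracy printed here says nothing about the paper's results.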
