CL Mar 16, 2019

Imbalanced multi-label classification using multi-task learning with extractive summarization

arXiv:1903.06963v10.2 Has Code

Originality: Incremental advance

AI Analysis

This work addresses data scarcity issues in NLP tasks for researchers and practitioners, though it is incremental as it builds on existing multi-task learning and RNN methods.

The paper tackles imbalanced multi-label classification and extractive summarization under limited training data by training both tasks jointly with multi-task learning. It reports a 50% improvement in summarization accuracy and a 75% improvement in classification accuracy relative to RNN baselines.

Extractive summarization and imbalanced multi-label classification often require vast amounts of training data to avoid overfitting. In situations where training data is expensive to generate, leveraging information between tasks is an attractive approach to increasing the amount of available information. This paper employs multi-task training of an extractive summarizer and an RNN-based classifier to improve summarization and classification accuracy by 50% and 75%, respectively, relative to RNN baselines. We hypothesize that concatenating sentence encodings based on document and class context increases generalizability for highly variable corpuses.
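The two ideas in the abstract, concatenating sentence encodings with document and class context and training both tasks jointly, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the weighting parameter `alpha`, the `pos_weight` term for rare labels, and the use of binary cross-entropy for both tasks are all assumptions made for illustration.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def weighted_bce(logits, targets, pos_weight=1.0):
    """Mean binary cross-entropy; positive targets are scaled by pos_weight.

    pos_weight > 1 is one common way to counter label imbalance
    (an assumption here, not something the abstract specifies).
    """
    eps = 1e-12
    total = 0.0
    for z, y in zip(logits, targets):
        p = sigmoid(z)
        total -= pos_weight * y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
    return total / len(logits)


def sentence_features(sentence_enc, document_ctx, class_ctx):
    """Concatenate a sentence encoding with document and class context vectors."""
    return list(sentence_enc) + list(document_ctx) + list(class_ctx)


def multitask_loss(summ_logits, summ_targets, cls_logits, cls_targets,
                   alpha=0.5, pos_weight=1.0):
    """Hypothetical joint objective: a weighted sum of the two task losses.

    summ_*: one logit/target per sentence (keep in summary or not).
    cls_*:  one logit/target per class (multi-label).
    """
    l_summ = weighted_bce(summ_logits, summ_targets)
    l_cls = weighted_bce(cls_logits, cls_targets, pos_weight)
    return alpha * l_summ + (1 - alpha) * l_cls
```

In a real model both sets of logits would be produced from shared encoder parameters, so gradients from the summarization loss also shape the representation used by the classifier, and vice versa. That sharing is what lets one task act as extra supervision for the other when labeled data is scarce.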
