CL, LG | Mar 14, 2023

Finding the Needle in a Haystack: Unsupervised Rationale Extraction from Long Text Classifiers

arXiv:2303.07991v1 | 1 citation | h-index: 27
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of extracting meaningful rationales from long-text classifiers, a problem relevant to NLP researchers and practitioners, though it appears to be an incremental improvement over existing methods.

The paper tackles poor token-level prediction quality in long-sequence transformer models for document classification, finding that standard soft attention methods perform significantly worse when combined with the Longformer language model. The authors propose a compositional soft attention architecture that applies RoBERTa to each sentence separately; it significantly outperforms Longformer-driven baselines on sentiment classification datasets and runs significantly faster.

Long-sequence transformers are designed to improve the representation of longer texts by language models and their performance on downstream document-level tasks. However, not much is understood about the quality of token-level predictions in long-form models. We investigate the performance of such architectures in the context of document classification with unsupervised rationale extraction. We find standard soft attention methods to perform significantly worse when combined with the Longformer language model. We propose a compositional soft attention architecture that applies RoBERTa sentence-wise to extract plausible rationales at the token-level. We find this method to significantly outperform Longformer-driven baselines on sentiment classification datasets, while also exhibiting significantly lower runtimes.
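The compositional idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The sketch assumes that each sentence is encoded independently, that soft attention is a softmax over dot products with a learned query, and that a token's rationale score is its within-sentence attention weight multiplied by its sentence's document-level weight. `toy_encoder` is a hypothetical stand-in for RoBERTa, and all function names and query vectors are invented for illustration.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def toy_encoder(sentence, dim):
    """Hypothetical stand-in for a sentence-level encoder such as RoBERTa.

    Returns one deterministic pseudo-random vector per token; a real model
    would return contextual embeddings for the sentence.
    """
    vectors = []
    for token in sentence:
        rng = random.Random(token)  # string seeds are deterministic
        vectors.append([rng.uniform(-1.0, 1.0) for _ in range(dim)])
    return vectors


def attend(vectors, query):
    """Soft attention: weights over `vectors` plus their weighted sum."""
    weights = softmax([dot(v, query) for v in vectors])
    pooled = [sum(w * v[i] for w, v in zip(weights, vectors))
              for i in range(len(query))]
    return weights, pooled


def compositional_token_scores(sentences, token_query, sentence_query):
    """Compose token-level and sentence-level attention (assumed scheme)."""
    dim = len(token_query)
    token_weights, sentence_vectors = [], []
    for sentence in sentences:
        # Each sentence is encoded on its own, never the full document.
        weights, pooled = attend(toy_encoder(sentence, dim), token_query)
        token_weights.append(weights)
        sentence_vectors.append(pooled)
    # Attend over sentence representations to build the document vector.
    sentence_weights, doc_vector = attend(sentence_vectors, sentence_query)
    # Token score = sentence weight * within-sentence token weight.
    scores = [[sw * tw for tw in tws]
              for sw, tws in zip(sentence_weights, token_weights)]
    return scores, doc_vector


document = [["the", "plot", "was", "dull"], ["great", "acting", "though"]]
query = [0.5] * 8  # placeholder for learned attention parameters
scores, doc_vector = compositional_token_scores(document, query, query)
```

Under these assumptions the token scores form a distribution over the whole document, so the highest-scoring tokens can be read off as candidate rationales, and the encoder only ever sees one sentence at a time rather than the full long sequence.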

