LG | CL | Jun 4, 2021

Learning Slice-Aware Representations with Mixture of Attentions

arXiv:2106.02363v1712 citations
Originality: Incremental advance
AI Analysis

This work addresses the need for fine-grained model improvement on critical data slices in machine learning systems, though it is incremental as it builds on existing slice-based learning methods.

The paper tackles the problem of improving model performance on specific data subsets or slices while maintaining overall accuracy, by extending slice-based learning with a mixture of attentions to learn slice-aware representations. It shows that this approach outperforms baseline and original slice-based learning methods on monitored slices in two natural language understanding tasks.

Real-world machine learning systems achieve remarkable performance on coarse-grained metrics such as overall accuracy and F1 score. However, model improvement and development often require fine-grained modeling of individual data subsets, or slices, for instance the slices where a model performs unsatisfactorily. In practice, there is tangible value in developing models that pay extra attention to critical slices or slices of interest while retaining the original overall performance. This work extends the recent slice-based learning (SBL) approach (Chen et al., 2019) with a mixture of attentions (MoA) to learn slice-aware dual attentive representations. We empirically show that the MoA approach outperforms both the baseline method and the original SBL approach on monitored slices in two natural language understanding (NLU) tasks.
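To make the contrast between coarse-grained and slice-level metrics concrete, the following is a minimal sketch of per-slice evaluation. It is not taken from the paper: the slice function `short_utterance`, the helper names `accuracy` and `slice_report`, and the toy intent-classification data are all hypothetical, chosen only to illustrate how a model can look acceptable overall while failing on a monitored slice.

```python
from typing import Callable, Dict, List, Sequence

# A slice function maps one input to True (in the slice) or False.
SliceFn = Callable[[str], bool]


def accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Fraction of matching labels; NaN for an empty slice."""
    if len(y_true) == 0:
        return float("nan")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def slice_report(
    texts: Sequence[str],
    y_true: Sequence[str],
    y_pred: Sequence[str],
    slice_fns: Dict[str, SliceFn],
) -> Dict[str, float]:
    """Overall accuracy plus accuracy restricted to each named slice."""
    report = {"overall": accuracy(y_true, y_pred)}
    for name, fn in slice_fns.items():
        idx: List[int] = [i for i, text in enumerate(texts) if fn(text)]
        report[name] = accuracy(
            [y_true[i] for i in idx], [y_pred[i] for i in idx]
        )
    return report


# Hypothetical slice: single-word utterances, which tend to be ambiguous.
def short_utterance(text: str) -> bool:
    return len(text.split()) <= 1


# Toy data (illustrative only, not results from the paper).
texts = ["play some jazz", "what's the weather", "play", "set alarm 7am", "stop"]
y_true = ["music", "weather", "music", "alarm", "stop"]
y_pred = ["music", "weather", "video", "alarm", "music"]

report = slice_report(texts, y_true, y_pred, {"short_utterance": short_utterance})
for name, value in report.items():
    print(f"{name}: {value:.2f}")
```

In this toy setting the overall accuracy hides the fact that every example in the `short_utterance` slice is misclassified, which is the kind of gap that slice-aware methods are meant to close without lowering the overall number.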

