CL · Aug 11, 2024

Multitask Fine-Tuning and Generative Adversarial Learning for Improved Auxiliary Classification

arXiv:2408.15265v1 · h-index: 1
Originality: Incremental advance
AI Analysis

This work addresses improving classification performance in NLP tasks, but it is incremental as it builds on existing BERT and GAN methods.

The study tackled multitask fine-tuning and generative adversarial learning for auxiliary classification, achieving 0.516 sentiment accuracy, 0.886 paraphrase accuracy, and 0.864 semantic similarity correlation, and found that a conditional generator produced embeddings with clear class correlation.

In this study, we implement a novel BERT architecture for multitask fine-tuning on three downstream tasks: sentiment classification, paraphrase detection, and semantic textual similarity prediction. Our model, Multitask BERT, incorporates layer sharing and a triplet architecture, custom sentence pair tokenization, loss pairing, and gradient surgery. Such optimizations yield a 0.516 sentiment classification accuracy, 0.886 paraphrase detection accuracy, and 0.864 semantic textual similarity correlation on test data. We also apply generative adversarial learning to BERT, constructing a conditional generator model that maps from latent space to create fake embeddings in $\mathbb{R}^{768}$. These fake embeddings are concatenated with real BERT embeddings and passed into a discriminator model for auxiliary classification. Using this framework, which we refer to as AC-GAN-BERT, we conduct semi-supervised sensitivity analyses to investigate the effect of increasing amounts of unlabeled training data on AC-GAN-BERT's test accuracy. Overall, aside from implementing a high-performing multitask classification system, our novelty lies in the application of adversarial learning to construct a generator that mimics BERT. We find that the conditional generator successfully produces rich embeddings with clear spatial correlation with class labels, demonstrating avoidance of mode collapse. Our findings validate the GAN-BERT approach and point to future directions of generator-aided knowledge distillation.
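
The abstract names gradient surgery as one of the multitask optimizations but does not spell out the procedure. As a rough illustration only, the sketch below assumes the common PCGrad-style formulation: when two task gradients point in conflicting directions (negative dot product), the conflicting component is projected away before the gradients are summed. The function names `dot` and `pcgrad`, the plain-list gradient representation, and the toy values are hypothetical and are not taken from the paper.

```python
import random


def dot(u, v):
    """Inner product of two equal-length gradient vectors."""
    return sum(a * b for a, b in zip(u, v))


def pcgrad(task_grads, rng=None):
    """Merge per-task gradients with PCGrad-style projection (assumed variant).

    task_grads: one flat gradient (list of floats) per task.
    Returns a single combined gradient of the same length.
    """
    rng = rng or random.Random(0)
    # Work on copies so the original task gradients stay untouched.
    projected = [list(g) for g in task_grads]
    for i, g_i in enumerate(projected):
        others = [j for j in range(len(task_grads)) if j != i]
        rng.shuffle(others)  # visit the other tasks in random order
        for j in others:
            g_j = task_grads[j]
            overlap = dot(g_i, g_j)
            norm_sq = dot(g_j, g_j)
            if overlap < 0 and norm_sq > 0:
                # Conflict: remove the component of g_i that opposes g_j.
                scale = overlap / norm_sq
                for k in range(len(g_i)):
                    g_i[k] -= scale * g_j[k]
    dim = len(task_grads[0])
    return [sum(g[k] for g in projected) for k in range(dim)]


# Toy 2-D gradients standing in for two of the three tasks.
sentiment_grad = [1.0, 0.0]
paraphrase_grad = [-1.0, 1.0]  # conflicts with sentiment_grad
merged = pcgrad([sentiment_grad, paraphrase_grad])
```

In a real training loop the same projection would be applied to the flattened gradients of the shared BERT parameters once per step, with one gradient per task loss; whether the paper uses exactly this variant is not stated in the abstract.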

