CL · Jul 7, 2022

Sensitivity Analysis on Transferred Neural Architectures of BERT and GPT-2 for Financial Sentiment Analysis

arXiv:2207.03037v1 · 5 citations · h-index: 2
Originality: Synthesis-oriented
AI Analysis

This work addresses the sensitivity of fine-tuning for financial applications, providing incremental insights into parameter stability for practitioners.

The study investigated the fine-tuning performance and parameter sensitivity of pre-trained BERT and GPT-2 models for financial sentiment analysis, finding that BERT parameters are hypersensitive to stochasticity while GPT-2 is more stable, and that early layers in both models contain essential word pattern information.

The explosion in novel NLP word embedding and deep learning techniques has prompted significant effort toward potential applications, one of which is the financial sector. Although much work has been done on state-of-the-art models such as GPT and BERT, relatively few studies examine how well these models perform when fine-tuned after pre-training, or how sensitive their parameters are. We investigate the performance and sensitivity of transferred neural architectures from pre-trained GPT-2 and BERT models. We test fine-tuning performance under varying choices of frozen transformer layers, batch size, and learning rate. We find that the parameters of BERT are hypersensitive to stochasticity in fine-tuning, whereas GPT-2 is more stable under the same procedure. The results also indicate that the earlier layers of GPT-2 and BERT contain essential word pattern information that should be maintained.
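
The experimental setup described in the abstract (freezing transformer layers, then varying batch size and learning rate) might be sketched as follows. This is a minimal illustration, not the authors' code: the helper names `names_to_freeze` and `sweep_configs` are hypothetical, the parameter-name patterns assume Hugging Face style naming (`encoder.layer.<i>.` for BERT blocks, `transformer.h.<i>.` for GPT-2 blocks), and any grid values passed in are placeholders, not the settings used in the paper.

```python
import re
from itertools import product

# Assumed naming conventions (Hugging Face style), not taken from the paper:
#   BERT blocks:  "bert.encoder.layer.<i>.…"
#   GPT-2 blocks: "transformer.h.<i>.…"
_BLOCK_INDEX = re.compile(r"(?:encoder\.layer|transformer\.h)\.(\d+)\.")
# Embedding tables: BERT "embeddings.…", GPT-2 "wte." (tokens) and "wpe." (positions).
_EMBEDDING = re.compile(r"(?:embeddings\.|\bwte\.|\bwpe\.)")


def names_to_freeze(param_names, n_frozen_blocks, freeze_embeddings=True):
    """Select parameter names to hold fixed during fine-tuning.

    Freezes the first `n_frozen_blocks` transformer blocks and, optionally,
    the embedding tables. Anything else (later blocks, final layer norm,
    classification head) stays trainable.
    """
    frozen = []
    for name in param_names:
        match = _BLOCK_INDEX.search(name)
        if match:
            if int(match.group(1)) < n_frozen_blocks:
                frozen.append(name)
        elif freeze_embeddings and _EMBEDDING.search(name):
            frozen.append(name)
    return frozen


def sweep_configs(frozen_options, batch_sizes, learning_rates, seeds):
    """Enumerate every (frozen blocks, batch size, learning rate, seed) run.

    Repeating each setting over several seeds is one way to expose
    sensitivity to stochasticity in fine-tuning.
    """
    return [
        {"n_frozen_blocks": f, "batch_size": b, "learning_rate": lr, "seed": s}
        for f, b, lr, s in product(frozen_options, batch_sizes, learning_rates, seeds)
    ]
```

With a PyTorch model, the selected names could be applied by looping over `model.named_parameters()` and setting `requires_grad = False` on each parameter whose name is in the returned list. Comparing the spread of validation scores across seeds for each configuration then gives a rough picture of how stable each architecture is.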

