LG, CL | Dec 13, 2024

Financial Sentiment Analysis: Leveraging Actual and Synthetic Data for Supervised Fine-tuning

arXiv:2412.09859v1 | 1 citation | h-index: 1
Originality: Incremental advance
AI Analysis

This work addresses the scarcity of labeled data for financial sentiment analysis, a task that matters to investors and analysts. The contribution is incremental, since it builds on existing models such as BERT.

The paper tackles financial sentiment analysis by fine-tuning language models on both actual and synthetic data, which improves accuracy and F1 score on the Financial PhraseBank dataset at the 50% and 100% agreement levels.

The Efficient Market Hypothesis (EMH) highlights the role of financial news in stock price movement. Financial news comes in the form of corporate announcements, news titles, and other forms of digital text. Insights can be generated from financial news with sentiment analysis. General-purpose language models are too general for sentiment analysis in finance. Curated labeled data for fine-tuning general-purpose language models are scarce, and existing fine-tuned models for sentiment analysis in finance do not capture the maximum context width. We hypothesize that using actual and synthetic data can improve performance. We introduce BertNSP-finance to concatenate shorter financial sentences into longer financial sentences, and finbert-lc to determine sentiment from digital text. The results show improved accuracy and F1 score on the Financial PhraseBank data with $50\%$ and $100\%$ agreement levels.
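
The idea of concatenating shorter sentences into longer ones can be illustrated with a minimal sketch. This is not the authors' BertNSP-finance implementation: the function `concat_short_sentences`, the `follows` scorer interface, the whitespace token count, and the default `threshold` and `max_tokens` values are all assumptions made for illustration. In practice the scorer would presumably be a next-sentence-prediction model; here a toy word-overlap scorer stands in for it so the example runs on its own.

```python
from typing import Callable, List


def concat_short_sentences(
    sentences: List[str],
    follows: Callable[[str, str], float],
    max_tokens: int = 512,
    threshold: float = 0.5,
) -> List[str]:
    """Greedily merge consecutive sentences into longer sequences.

    `follows(a, b)` is a hypothetical scorer returning how plausibly
    sentence b continues text a (a stand-in for an NSP model).
    Length is approximated by whitespace-separated words (assumption).
    """
    merged: List[str] = []
    current = ""
    for sent in sentences:
        if not current:
            current = sent
            continue
        candidate = current + " " + sent
        fits = len(candidate.split()) <= max_tokens
        if fits and follows(current, sent) >= threshold:
            # Extend the running sequence with the next sentence.
            current = candidate
        else:
            # Close the running sequence and start a new one.
            merged.append(current)
            current = sent
    if current:
        merged.append(current)
    return merged


def toy_follows(a: str, b: str) -> float:
    """Toy scorer: fraction of b's words that also appear in a."""
    a_words = set(a.lower().split())
    b_words = b.lower().split()
    if not b_words:
        return 0.0
    return sum(w in a_words for w in b_words) / len(b_words)


if __name__ == "__main__":
    short = ["Net profit rose sharply.", "Profit rose again.", "Shares fell."]
    for seq in concat_short_sentences(short, toy_follows):
        print(seq)
```

Passing the scorer as a callable keeps the merging logic independent of any particular model, so a real next-sentence-prediction scorer could be swapped in without changing the loop.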

Code Implementations: 1 repo
